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Abstract 

The problems of decoherence and destructive measurement make 
quantum computers especially unstable to errors. We begin by ex- 
ploring Kitaev's framework for quantum error correcting codes that 
exploit the inherent stability of certain topological quantum numbers. 
Analytic estimates and numerical simulation allow us to quantita- 
tively assess the conditions under which this stability persists. In 
order to allow quantum computations to be performed on data stored 
in these and other unconcatenated codes, we present a probabilistic 
algorithm for executing an encoded Toffoli gate via preparation of a 
certain three-qubit entangled state as a computational resource. The 
algorithm works to iteratively purify an initial input state, conditional 
on favorable measurement outcomes at each stage in the iteration. 

Analogous iterative purification methods are identified in the con- 
text of classical algorithms for simulating quantum dynamics. Here 
an ensemble of stochastic trajectòries over a classical configuration 
space is iteratively driven toward a fixed point that yields information 
about a quantum evolution in an associated Hilbert space. We test 
this procedure for a restricted subspace of the Heisenberg ferromagnet. 

The trajectory ensemble concept is then applied to the problem 
of mechanism identification in closed-loop quantum control, where a 
precise definition of mechanism in terms of these trajectòries leads to 
methods for extracting dynamical information directly from the kinds 
of optimization procedures already realized in control experiments on 
optical-molecular systems. We demonstrate this on simulated data for 
controlled population transfer in a seven level molecular system under 
shaped-pulse làser excitation. 
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Chapter 1 

Elements of Quantum 
Information 

1.1 Universal Quantum Simulators 

It is a bromide among particle physicists that the Standard model contains (or 
that some still undefmed string/M theory may someday contain) all the rest 
of physics at lower energy scales and that the problem is just the difficulty of 
actually computing its implications on these scales. It is a bromide among 
condensed matter physicists that chemistry and materials science are likewise 
reducible to one large iV-particle Schrodinger equation. And it is a bromide 
among chemists that biology is reducible in some such way to chemistry. 

But why are even the most sophisticated modern computers unable to 
seriously penetrate these age-old disciplinary boundaries? It appears that the 
main obstacle to exploiting this reductionism is native to quantum mechanics, 
and it is not so subtle. Consider how a computer could represent a purely 
classical, physical system. Assuming a discretized state space so that for 
N distinguishable partides each may occupy one of S possible states, the 
total amount of information necessary for a computer to store one particular 
configuration of the system is just N\og 2 S bits. One particular state of the 
corresponding quantum system, however, will require for its description a set 
of S N complex numbers, one for every possible classical state. This means 
the number of bits necessary for a computer to store this description will 
grow exponentially, not linearly, with N. As N gets large, not only does 
it become computationally intractable to predict the future behavior of the 
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system — it is intractable even to fully specify its present state. 

Something here is not quite right, though. The desired result of a given 
quantum simulation may not necessarily include a complete quantum descrip- 
tion of some final state. The heat capacity of a metal for instance — while 
requiring a quantum mechanical treatment in certain circumstances — is still 
just one number. It is only the intermediate stages in the calculation which 
presumably require consideration of these intractable quantum state vectors. 
But perhaps there are alternative representations which obviate such huge 
memory sinks. Indeed there are, and they succeed in eliminating the expo- 
nential blow-up of storage and computation costs for certain kinds of systems. 
However, no such techniques exist that work on a more general level, and 
their effectiveness at particular problems is something of an open question 
to be probed one problem at a time. 

This leads to a more pessimistic appraisal: if ab initio computation is 
in general so expensive, why bother with it at all? If one is interested in 
the behavior of a genèric quantum system in the laboratory, perhaps it will 
always be easier to perform actual experiments on the system itself. Let the 
system perform the computations for you! Indeed the physical system may 
be regarded as a kind of ultra-special purpose computer, fit for exactly one 
problem. Perhaps one is more ambitious, however, and might attempt to use 
one physical system that is especially accessible in the laboratory in order to 
simulate some other kind of system entirely. For instance there are proposals 
to study gravitational systems by resort to analogies in condensed matter 
PP, whereby the material relations for certain liquids are seen to fortuitously 
mimic some aspects of the Einstein field equations. 

Even bolder would be to imagine that certain quantum systems, by virtue 
of some special types of interactions, are able to simulate large classes of 
quantum systems in a way which is less fortuitous and more by design. In 
essence, exactly this is what is going on in a classical computer when it 
simulates other classical systems. There is no real physical similarity between 
a silicon microchip and the turbulent flow of nitrogen gas which it simulates 
via a fluid dynamics algorithm. Why can't such a computational universality 
exist in the quantum domain as well? 

This was the motivation of Feynman when he proposed the idea of a 
quantum computer as a universal quantum simulator |2j. 

A sufliciently well controlled quantum mechanical system that can be 
initialized in a chosen state, evolved by tunable interactions, and measured 
with high fidelity deserves the title quantum computer if these interactions 
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are sufficiently universal as to mimic a large class of other quantum systems 
bearing little physical relationship to that of the computer. The crux of 
quantum computation is in this notion of computational universality. The 
difference between a liquid Helium system mimicking the quantum fluctua- 
tions of space-time and a quantum computer mimicking, say, the molecular 
dynamics of wàter is that tomorrow the quantum computer may be used to 
simulate something entirely different; whereas, the liquid Helium system will 
always be stuck on the same physical problem of quantum gravity. 

As opposed to the Helium system, which achieves its mimickery through 
a kind of mathematical coincidence, the quantum computer works (rather, 
would work, if ever one were built) by detailed independent control of many 
many individualized components. The bàsic unit of the quantum computer 
is the quantum bit (or qubít) 1 . One qubit is a quantum system represented 
by a two-dimensional Hilbert space with basis states conventionally denoted 
as |0) and |1). iV qubits are represented by the space spanned by all states 
\x) where x is an iV-bit binary string. The dimension of this space thus grows 
as 2". 

A general quantum system with classical configuration space Q may then 
be represented as an iV-qubit system by discretizing Q into 2 N elements. 
In order to simulate the system evolution in the quantum computer, this 
evolution must be discretized in time, and each resulting unitary step must 
be translated into operations to be performed on the qubits. In particular if 
the classical configuration space Q = (0, 1)®^ for N distinguishable partides 
in ld is discretized such that the i-th coordinate can be represented by a 
bit string Xi (containing, say, S bits), we can write the infinitesimal system 
evolution operator as 

U(t, t + At)= e- iHAt « e- iTAt e- [VAt 

where the kinetic term matrix elements (xí\T\xj) vanish unless i = j. Here, 
we are taking the state of the computer to comprise a superposition of terms 

\xi) ■ ■ ■ \x N ) 

where the length S quantum register \xí) encodes the instantaneous value 
of the i-th coordinate. If we assume only pairwise interactions, then the 



1 Sometimes "qubit" will be further abbreviated as just "bit" when the context is not 
one comparing quantum and classical information. 
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potential operator may be written 

V = \ X i, X j)( x h X j\ V \ x k, x l){ x k, x l\ ■ 

ijkl 

In order to let e~ lVAt ~ 1 — iVAt act on our registers, we thus need to 
perform a quantum computation in which arbitrary pairs of registers are in- 
teracted and, based on the initial vàlues of those registers, transformed into a 
superposition whose coefficients are determined by the (x i} Xj\V\xk,xi) terms 
above. Moreover, the vàlues of these terms must themselves be computed 
through arithmetical operations in the quantum computer, for instance if 
they are given analytically as polynomials in xi — Xk and Xj — x\. 

A given arithmetic operation between two registers will be accomplished 
by a sequence of operations carried out on the physical qubits constituting 
these registers, either individually or in pairs (or possibly in higher order 
combinations). This is the bàsic idea of a quantum gate. The essential differ- 
ence between a quantum and a classical gate is simply that in the quantum 
case, a register consisting of a single term, e.g. |01), may be taken into a reg- 
ister consisting of a superposition of terms, e.g. 1 00) + |10). More generally, 
a quantum gate is just a unitary transformation applied to the qubits it acts 
on. 

In order to apply the gate corresponding to the unitary operator e~ lVAt 
in this case we must simply allow it to act on every combination of two coor- 
dinate registers \xí) and \xj) — with each such action reduced to a sequence 
of bit-wise gates on the individual qubits constituting the registers. This 
amounts to 0(N 2 ) operations. Propagating from t = out to t = T will 
thus require 0(N 2 T/ At) operations, clearly scaling polynomially in the sys- 
tem size N. (See [3] for a more detailed treatment of quantum computers as 
simulators.) 

1.2 Steering a Quantum Computer 

A classical computer, because its gates take a single bit-string only into 
another single bit-string, will need to perform a separate computation for 
every term in a superposition like 52i «i \x^ ■ ■ ■ Xi N ) if it is to simulate a 
genèric evolution for the above quantum system. This implies a number of 
computations scaling exponentially in N. On the other hand the quantum 
computer is in effect carrying out all these operations in parallel. More 
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generally, consider an N bit string x, and a function f(x) outputing a single 
bit. A single classical computation of / might yield something like: 

x = 0101001 •••01 -> f{x) = 1 

whereas, if the function / can be realized as a unitary transformation U 
so that U\x)\Q) = \x)\f(x)), then a quantum computer could perform the 
following as single function call: 

5>>|o>-$>> |/(*)> 

X X 

where the action of U has been distributed over all terms in the sum by 
virtue of its linearity. This "massive parallelism" is the essence of the expo- 
nential speed-up realized by a quantum computer in the problem of quantum 
simulation (and likewise in Shor's factoring algorithm 

As stated this parallelism seems too good to be true. It seems we can 
trivially crack NP-complete problems (e.g. optimization problems like the 
traveling salesman problem) by simply encoding all possible solutions in a 
quantum register (e.g. J2 X \x)) and devising some U to perform computations 
on all these possible solutions in parallel. Indeed this is too good to be true, 
for the necessity of reading out the computer at the end of the computation 
takes on a new significance in the quantum domain. Reading out entails 
measuring the register, which in this case, will collapse the register into 
the term corresponding to a single one of the candidate solutions \x). This 
irreversible process has destroyed almost all of the information encoded in 
our quantum state. 

Indeed in the above sketch of a quantum simulation algorithm we cannot 
read out the entire state vector at t — T. We would expect to define some 
Hermitian operator of interest, and effectively measure that operator alone. 
The massive parallelism is only half the battle — this second aspect, being 
clever about what to measure at the end, is equally important. Another way 
of looking at this is not as some special measurement to be made at t = T, but 
rather a sequence of additional operations to be performed on the encoded 
quantum information that will allow us to extract the desired properties of 
our final state through more standardized (single qubit) measurements. It is 
the task of quantum algorithms to devise operations that can make use of 
this kind of massive but restricted parallelism. 

An essential issue in designing such algorithms is the question of what 
primitive operations are assumed available to act on qubits. Without any 
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constraints here, we could simply define some final state that encodes the so- 
lution to a problem of interest, assert that such a state is reachable through 
some unitary transformation from the initial state 1 00 ■ ■ - 0), and declare vic- 
tory. Rather, what we would like is a small set of primitive unitary operations 
involving a limited number of qubits at one time, such that when applied in 
combination they can produce any desired unitary (with arbitrary accuracy) 
on the space of all the qubits jointly. Such a set of primitive operations is 
called a universal gate set, and indeed the first fundamental results in the 
field concerned the construction of such a set. 

The problem of finding a compact universal gate set and demonstrating 
its universality — i.e. the ability to generate arbitrary unitàries over a Hilbert 
space with arbitrarily many qubits — is closely related to the problem of the 
controlability of a continuous quantum system whose evolution is given by 
some Hamiltonian 

H = H int + Y J <t)H l . 

i 

Here H int corresponds to the infernal dynamics of the system, and the Hi cor- 
respond to controllable external interactions imposed by the experimenter, 
so that the couplings serve as tunable control knobs. Such a system is 
referred to as controllable if by suitable tuning of the Oj(í), any state \i])(T)) 
may be reached from some fixed initial state \if)(0)), possibly subject to cer- 
tain constraints imposed on the magnitude of the and their derivatives 
for t G (0,T). 

The essential resource we have in trying to produce motions in the Hilbert 
space which are not generated simply by a single Hi is the possibility of 
composing such motions as the following: 

ç-ïHiAtç-iHjAtçiHiAtçiHjAt 

which to lowest order results in a motion generated by the commutator 
[Hi,Hj]. Moreover, given these Hi, we can generate all possible iterated 
commutators, and the question becomes whether the àlgebra of all these 
commutators exhausts the space of all Hermitian matrices. If so, then the 
set {Hi} is universal over all possible Hermitian generators [S], directly anal- 
ogous to a universal gate set. The difference is principally just that our 
control knobs here may be adjusted continuously in time, while in the quan- 
tum computing context, the control knobs are simply the choices of which 
gates to apply in what order and are therefore discrete in nature. 
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We will be interested in problems of continuous quantum control later. 
Now it will sufíice just to recognize the possibility of universal gate sets, 
which constitute our most elementary tool box in the design of quantum 
algorithms. For example, it is well known [ü] that one universal gate set can 
be built from just two gates: (i) a genèric single qubit rotation, e.g. e l9X ^ 2 , 
where 9 is an irrational múltiple of tt, and (ii) the controlled-not (C-NOT) 
gate, which acts on two qubits. 2 We can fully specify this C-NOT gate by 
its action on the basis elements in a two-qubit Hilbert space: 

C-NOT 

|00) -»■ |00) 

|01) -> |01) 

|10) -> |11) 

|11) -> |10) 

where we say that the C-NOT is taken from the first qubit (the "control" 
qubit) to the second one (the "target" qubit). The action of this gate on a 
general two-qubit state is then implied by the fact that the gate is a linear 
operation. 

It should be noted that the one-qubit rotation combined with the C- 
NOT may be used, first, to generate another one-qubit rotation that does 
not commute with the original one. These two can then be used to generate 
all possible one-qubit rotations j5j. The significance of the requirement that 
the original one-qubit rotation involve an angle 9 that is an irrational múltiple 
of tt is that, otherwise, the commutator àlgebra associated with these two 
rotations would close on itself before exploring the entire SU(2) of qubit 
rotations — yielding only a finite number of reachable rotations. 

Some such gate as the C-NOT is crucial to exploit the massive parallelism 
of a quantum computer because it is an entangling operation, i.e. it has the 
power to take two initially unentangled qubits and produce a final state in 
which they are entangled, for instance: 

|00) + |10) -> |00) + |11). 

Whereas the initial state may be factored, hence has no entanglement, the 
final state cannot. Obviously massive parallelism in a quantum computer 

2 The notation X, Y, and Z is commonly used for the Pauli operators a x , a y , and <j z . 
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would require more than just two-qubit entanglement. The idea in using 
the universal gate set defined above is that more complicated entanglement s 
among many qubits can be built up by successive applications of the C-NOT 
gate to different pair of qubits. 

1.3 Unitary and Decoherent Errors 

At this point we have laid out one very significant application of a quantum 
computer (simulating quantum systems) and the bàsic capabilities that must 
be realized in any physical implementation of such a device. In principle, then 
it seems like quantum engineers would now be at a place analogous to that of 
Mauchly and Eckert on the eve of the ENIAC project. Obviously formidable 
technical problems remain, but no bàsic theoretical obstacles seem to remain. 

Except for one thing, which was probably more of a nuisance than a 
serious obstacle for Mauchly and Eckert: the problem of error correction. 
Due to the imperfection of components and action of gates, etc, there will 
always be errors creeping into any computation. For a classical computer, 
there will always be the occasional that is accidentally flipped to a 1 and 
vice versa. Uncontrolled, these errors will tend to build up and totally corrupt 
the computation after an amount of time depending on the computer's error 
rates. 

It was von Neumann who first exhibited a straightforward method to 
handle this problem. If we want a particular bit to store a 1, for example, we 
should actually use not one physical bit for the job but a couple, say N. We 
will simply set all of these bits to 1. After a little time has elapsed, some of 
these bits may have accidentally flipped to 0. We can combat this tendency 
simply by checking the bit vàlues and majority voting in order to flip the 
erroneous bits back. (Note: here we cannot just flip all the bits back to 1 
irrespective of the how many flipped, because that would require us to have 
stored a separate record of what the correct value of the bit was, namely 1.) 

If the probability of any individual bit having flipped over this time is 
no more than some bound p, then as N gets large the chances that one 
half or more of these bits have accidentally flipped — hence the probability 
that our majority vote will fail to restore the proper bit state — goes down 
exponentially with N. This is the essential property of an error correcting 
code: exponential security, or, in other words, arbitrarily good security with 
only a logarithmic overhead of additional bits. 



CHAPTER1. ELEMENTS OF QUANTUM INFORMATION 9 



Constructing this particular error correcting code, that is the assignment 
of iV physical bits to encode one "logical" bit by pure repitition, was quite 
simple. In fact much more sophisticated codes exist for classical computers; 
however, they do not differ in the general nature of this scaling between the 
overhead required to implement the code (number of additional bits) and the 
security provided by the code (probability of failure). 

Unfortunately, the problem of error correction is qualitatively harder in 
the quantum domain. In fact, from the advent of quantum algorithms in 
the 1980's to the first proposal for a viable quantum error correcting code in 
1995 there were serious doubts about even the theoretical possibility of such 
a code 

There are two essential obstacles unique to maintaining the integrity of 
quantum information: decoherence and state reduction. Decoherence refers 
to the tendency of quantum systems to become increasingly entangled with 
their environments so that the interference effects necessary for quantum 
algorithms become gradually less pronounced, until they are finally reduced 
to unmeasurability. Loosely, decoherence is what turns a quantum system 
into a classical one, hence one incapable of allowing us to cash in on massive 
parallelism. 

Suppose, for instance, we are interested in measuring a physical effect 
associated with some quantum mechanical phase (e.g. an Aharanov-Bohm 
effect) in a two-level system. Let us first describe the process in the absence 
of any decoherence. Thus we might start with our system in the unnormalized 
state |0) + |1), and then enact the phase-generating process, which may be 
described by some unitary evolution 

|0) + |l)^|0) + e 2i *|l). 

In order to measure 0, we might perform a simple quantum gate on this 
single qubit, which is to apply a Hadamard rotation defined by the basis 
state transformations (neglecting normalization): 

Hadamard 
|0> - |0) + |1) 

|i> - |o)-|i). 

This gives 

|0) + e 2i ^|l) ^cos0|O) -isin0|l) 
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which allows us to transform the phase information into amplitude informa- 
tion through a process of quantum interference. We may now simply measure 
Z, i.e. the bit-value of this qubit. (Such a measurement, as opposed to a 
direct measurement of phase, is what we generally assume is available to us.) 
Repeated trials will be able to determine with high accuracy. 

We can now consider the effect of decoherence by explicitly accounting 
for the quantum state of the environment surrounding our system. Suppose 
the environment starts in some state \E), unentangled with one qubit, so 
that we have the total initial state 

(|o> + |i»|£>. 

Suppose as well that during the phase-generating process, some interaction 
between our qubit and the environment causes the latter to either respond 
(\E) — ► \E')) or not respond (\E) — > \E)) depending on the state of the 
qubit, i.e. 

(|0> + |l>)|£>^|0>|£> + e^|l>|£'>. 
After applying a Hadamard rotation, this becomes 

|0)(|£) + e 2i< *|£')) + \1)(\E} - e 2i(t> \E')) . 

Now, assuming the effect on the environment is eventually amplified enough 
that (E\E') = 0, measuring the qubit will have equal probability of yielding 
|0) or |1) independent of 0, hence it will reveal exactly nothing about 0. 

In other words, the simple quantum computation that in the absence of 
decoherence allowed us to extract phase information from our quantum state, 
has been rendered useless in the presence of decoherence — more precisely, in 
the presence of total decoherence, since we assumed (E\E') = 0. In this 
case we say that the qubit has totally decohered in the basis {|0), |1)}. If 
(E\E') 7^ 0, the qubit will have decohered only partially in this basis, and as 
(E\E') approaches zero, the number of measurements necessary to extract 
with a given accuracy will approach infinity. 

The crux of the decoherence problem in regard to storing and using quan- 
tum information lies in the ubiquity of what we classify as "environment," 
that is: everything not part of the finely controlled system which constitutes 
the quantum computer itself. Everything from ambient electromagnètic field 
modes (whether occupied or unoccupied), to air molècules capable of scat- 
tering off components of the computer, to the àtoms constituting a substrate 
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for these components, to other neglected degrees of freedom within the com- 
ponents themselves — all are environment. Because of this environmental 
ubiquity, each individual qubit will face an independent array of possibilities 
for it to decohere. Loosely, each qubit will have a constant (or bounded 
from below by a constant) probability of decohering within a given time in- 
terval, independent of the other qubits. Therefore, the probability that the 
unaided computer will maintain any fixed degree of coherence goes down 
exponentially with the number of qubits for sufficiently complicated (i.e. 
time-intensive) computations. 

1.4 Quantum Error Correcting Codes 

So far, the exact same comments may be made for a classical computer 
whose bits each independently face a fixed probability of being accidentally 
flipped. The problem for a quantum computer is that the bàsic method of 
error correction, the repitition code, fails trivially for a system of qubits. 

The quantum analog of the repetition code is easy to construct. Encode 
one logical qubit in N physical qubits by the assignment 



Suppose now we have encoded the logical state a|0) +/?|1) in n qubits. After 
some time has past, so me of these n qubits will have suffered errors, for 
example having their bit vàlues flipped: 



(we have taken n = 5 to illustrate bit flips on the third and fifth bit.) This 
is a unitary error — errors associated with decoherence processes might also 
occur. But neglecting decoherence for the moment, we would then like to 
measure all the qubits and somehow majority vote to determine how we 
should restore our state back to its original (pre-error) form. The problem is 
that even measuring only a single qubit collapses the state to a single com- 
putational basis element, here either |00101) or 1 11010). We may majority 
vote and correctly reconstruct, e.g., 1 11010) into 1 11111) ; however, we have 
eliminated all the quantum information comprising a and f3 in our original 




physical 

|000···0) 

|111···1) 



a|00000) + /?|11111> -> a|00101) + /3|11010) 
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state. Indeed we have learned something about these parameters by the 
probabilistic nature of this state reduction, but we have transformed some 
of the quantum information into classical information and simply destroyed 
the rest. 

A simple strategy to overcome this unfortunate state reduction is to mea- 
sure not each individual bit value — i.e. measure Zj on each qubit i — but to 
measure only the bit vàlues of each bit relative to its neighbors, which means 
measuring the product ZjZj + i where i + 1 is taken modulo n. Since both 
terms 1 11010) and 1 11010) are eigenstates with the same eigenvalue for any 
one of these measurements, we will not have collapsed our state. And a 
majority vote will allow us to restore the original error-free state. But how 
exactly are we to measure this product operator ZjZ í+1 ? 

Assuming we are able to measure single qubit operators like Zj itself, such 
product operators may be measured by simple procedures involving ancilla 
qubits — qubits that are used just as a temporary scratch pad and whose final 
state is not important for the overall computation. To measure ZiZ i+ i, we 
initialize one ancilla qubit a in the state |0), perform a C-NOT from qubit i 
to a, another C-NOT from i + 1 to a, and then measure Z a . These operations 
ensure that any eigenstate of ZiZ i+ i will, with the addition of the ancilla a, 
also be an eigenstate of Z a , with the same eigenvalue. So measuring Z a is 
equivalent, in terms of the information revealed and the effect on the total 
state, to measuring ZjZ í+1 itself. 

Thus we can overcome what seems the bàsic obstacle of revealing too 
much information when we make measurements necessary to correct errors 
in our state. Still, a uniquely quantum problem remains. We have only 
discussed the kind of error (whether decoherent or unitary) that applies to 
the bit-value of the state, as opposed to its phase. (In the decoherent case 
this relates to the basis in which we assume decoherence.) Suppose in the 
above error process, along with bit flip errors, our qubits can undergo phase 
errors: |0) — > |0) but |1) — > — 11). In particular suppose the first qubit i = 1 
alone suffers such a phase error. The above parity measurements will not 
reveal any information about this error, and we will end up not with the 
original error-free state, but with the logical state a\0) — /3\1). Thus a phase 
error occurring in even a single qubit is enough to undermine this kind of 
bit-parity code. 

A dual kind of code can be constructed that does exactly the opposite: it 
corrects phase errors but not bit errors. This phase-parity code is identical 
to the bit-parity code if we just prepare our state by applying a Hadamard 
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rotation to each qubit, hence the encoding is 

|o> - (|0> + |1))(|0> + |1»...(|0> + |1>) 
|i) - (|0>-|1»(|0>-|1»...(|0>-|1>). 

The effect of this is to change a physical bit error into a physical phase error 
and vice versa, which is how we can use the repetition idea above to fix phase 
errors. With this encoding the measurement procedure prior to our majority 
vote involves measuring XiX i+ i operators, not ZiZ i+í operators. 

Realizing the simple kind of duality between bit and phase errors illus- 
trated by the construction of this code, Peter Shor first proposed a fully 
quantum error correcting code in that it corrects both bit and phase errors 
Here, one logical qubit is encoded in nine physical qubits as follows 

|0) -> (|000) + |111))(|000) + |111))(|000) + |111)) 
|1) -»■ (|000> — |111>)(|000> — |111>)(|000> — |in>) . 

The measurement procedure is to measure for neighboring bits among 

the first three qubits taken as a set, and likewise for the second three, and 
then for the final three — the corresponding results are part of what is called 
the error syndrome. If a single bit among the nine had flipped, this syndrome 
will reveal which one and allow us to correct that error. To complete the 
syndrome we measure the operators XiX 2 X 3 and X 4 X 5 X 6 and X 7 X 8 X 9 , 
which is analogous to the Xi measurements in the pure phase code above. If 
a single phase error occurs, say to qubit 6, this will show up as a —1 result 
when we measure X^X^X^, indicating that we must choose either X 4 , X 5 , or 
Xq and apply it to our state. This corrects the phase error and returns us to 
our original error-free state. 

In the above, we have been somewhat cavalier about the kinds of errors 
that might occur in our computer. Even if errors are unitary in nature and 
not decoherent, it is unlikely that they will be either pure phase or pure bit 
errors. However any arbitrary unitary error for n qubits may be expressed 
as a direct sum of terms consisting of the identity and products of bit and 
phase errors. After such an error occurs, one sees the measurements specified 
above have the affect of collapsing the error itself into a set of pure phase 
and pure bit errors. 

A similar phenomenon occurs in regard to errors resulting from deco- 
herence. The act of measuring our qubit system has the fortunate effect of 
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actually unentangling it from the environment and transforming decoherence 
errors into a set of unitary (and in fact pure bit/phase) errors ^2]- If suffi- 
ciently many errors occur (e.g. more than one bit and one phase error in the 
9-qubit code above), however, the result may be that our recovery operations 
end up transforming our logical state a\0) + (3\1) into either a|l) + /3\0) or 
a|0) — We thus snowball errors to the physical qubits into a full-blown 
error to the encoded quantum information. Recovery has failed. 

This possibility is dealt with by constructing more sophisticated quantum 
codes, which permit more and more physical errors before a logical error is 
precipitated. In fact the pattern of Shor's 9-qubit code plainly suggests a 
recursive generalization. This code can be thought of as possessing not just 
two levels of qubits — the physical and the logical — but three levels. We 
have the physical qubits, then we have intermediate qubits encoded under 
|0) — > 1 000) and |1) — > 1 1 1 1) , and finally we have the logical qubits encoded 
under |0) — > (|0) + |1)) 3 and |1) — ► (|0) — |1)) 3 , where the qubits used for this 
last encoding are themselves not physical bits but the intermediate bits. We 
might just as well iterate this and employ 4,5, ... ,L levels of encoding. Here 
we have suggested alternating between bit and phase codes in this iteration; 
however, we can also regard this as iterating a single code, in this case the 
9-qubit code. In fact other codes, e.g. involving five or seven qubits, that 
protect against bit and phase errors are likewise amenable to this kind of 
hierarchical scheme. 

Codes gener ated in such a way are called concatenated codes, and consti- 
tute the first systematic, scalable means of quantum error correction 18J. 

The motivation for constructing such large codes is simply to leverage 
the trade-off between computational overhead (number of physical bits per 
logical bit) and informational security. Traditionally, security was measured 
in terms of how many errors to the physical qubits it would take in order 
that error correction procedures would turn out to corrupt the logical qubits. 
In the 9-qubit code, it takes in fact three errors. This code is thus said to 
have "distance" three. The overhead/security trade-off may be measured for 
a given code as a ratio of its distance to its block size — the number physical 
qubits necessary to form a block that encodes one logical qubit. 

As we increase the block size n, we might hope that code distance d has 
a linear asymptotic scaling with n. If through each round of error correction 
each qubit will have some error probability p (or even bounded above by p), 
we would be satisfied if p < d/n, for then the chances of recovery failure 
would fall exponentially with n, making it relatively painless to achieve very 
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high accuracy. 

However it is by no means necessary that p < d/n in order to obtain such 
an exponential scaling. In particular it is possible to imagines codes in which 
d/n — > as n — > oo but that are still exponentially secure in this limit. It 
may be that although it is possible that only d errors cause recovery failure, 
this becomes highly unlikely for large n, unlikely enough to overwhelm the 
high probability of merely realizing d errors in the first place. 

Let us examine this issue in the case of concatenating the 9-qubit code 
through L levels, and obtain a rough estimate of the chances of recovery 
failure assuming both bit and phase errors have an independent probability 
p per round of error correction. Consider the first level of the code, where we 
are dealing with physical qubits. The probability of two bit errors occurring 
(on two separate qubits) is then about 9p ■ 8p = 72p 2 , and likewise for two 
phase errors. For small p, the probability of either of these two cases is then 
about twice that, 144p 2 . Thus p — > 144p 2 is the mapping from the error rate 
at level 1 to that at level 2. At level 3, our "physical" qubits are actually the 
logical qubits at level 2, which have error rate 144p 2 , so the logical qubits at 
level 3 will fail with probability 144(144p 2 ) 2 . Iterating this calculation, gives 
that the overall failure rate F for the code up to L levels is 

F = U4 2L - 1+2L - 2+ - +1 p 2L = (lUp f = (144j9)" ' 315 , 

where we have expressed 2 L in terms of the block size n = 9 L , with log 9 2 w 
0.315. F thus has the desired property that it dies exponentially in n (rather 
a power of n) if the physical error rate p < 1/144. This is a very simple 
kind of threshold result. The crude estimate p c = 1/144 for the physical 
error rate threshold, or the "critical" error rate, would therefore serve as a 
benchmark for evaluating the viability of a given physical implementation of 
the concatenated 9-qubit code. 

In the above we have glossed over one very important point. By taking 
the error rate at level l as simply the failure rate at level l — 1, we have 
assumed that error correction at l will be just as easy as error correction at 
l — l. This is obviously not true. There are 9 times as many qubits in a block 
at l than at l — l. We will therefore require many more recovery operations 
and measurements at the higher level. And crucially: the recovery operations 
and measurements that we employ to correct errors are themselves liable to 
cause additional errors in the computer. Our own operations are faulty, and 
our codes must be designed to take this into account. Moreover, because we 
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must make joint measurements on múltiple qubits, e.g. measuring X1X2X3 
in the 9-qubit code itself, there is the possibility of an error in one qubit con- 
taminating other qubits in the block. In designing our recovery procedures, 
we must be sure that the tendency for our own actions to spread errors does 
not overbalance the error correction achieved through those procedures. 

Mathematically, this means that the error rate mapping between levels 
/ — 1 and l, which we had taken as 

P1-1 ->pi= 144p^_! , 

will now explicitly involve l in a more complicated manner. Determining 
this l dependence requires an analysis of how errors are both generated and 
spread by our recovery operations. Systematic calculations have been per- 
formed in this manner showing that fault-tolerant recovery is possible with a 
general class of concatenated codes, and better estimates of the critical error 
rate(s) are on the order of p c ~ 10 -4 as well as comparable thresholds for the 
accuracy of the (physical qubit) gates [T8] . 

What we have addressed so far is only the problem of storing quantum 
information with these codes. Additional qüestions arise when we want to 
also perform quantum gates and make measurements of the logical qubits 
encoded in such blocks. In other words, we need to determine what sequence 
of operations to perform on the physical bits themselves that will result in 
the application of a desired operation to the logical qubit (s). For instance, 
if we want to perform a bit flip (X gate) on a logical qubit stored with 
the 9-qubit code, we need to act on the state with operators that flip the 
three relevant phases, e.g. with the operators Z\, Z4, and Z7. The physical 
operation Z1Z4Z7 is therefore equivalent to the logical X operation. Similar 
correspondences have to be found for all members of a desired universal gate 
set in order to allow for universal computation on the encoded quantum 
information. Given a specification of some gate on the encoded information 
in terms of a sequence of gates on physical qubits, we must then analyze 
the propensity for errors to build up as a result. This will lead to similar 
threshold results corresponding to these physical gates [IB]. For example, the 
accuracy of a physical C-NOT gate will have to be below a certain critical 
value in order that using it in the performance of a specified logical gate will 
(with high probability) not spark a cascade of physical qubit errors that may 
damage the encoded information. 

Both the problem of storing and of performing long computations with 
quantum information have been essentially solved on the theoretical level 
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with these concatenated codes. However, the solution is not unique, and it 
seems new solutions will be necessary in order to bridge the gap between, 
on the one hand, the assumptions entering into present estimates of the 
performance of concatenated codes and, on the other hand, the forseeable 
experimental frontier in quantum information. 

Although it has not been emphasized above, one important assumption 
is that physical qubits separated by large distances in the computer may be 
gated together efficiently For instance, measuring an operator like Z\Z^Z-j at 
the highest level of a quadruply concatenated 9-qubit code (i.e. one with L = 
4) will require gating pairs of qubits separated typically by on order of 100 
other qubits — even if we assume qubits distributed over a two-dimensional 
lattice. Clearly this scenario poses a daunting experimental problem. Gating 
qubits will always require some kind of well controlled physical interaction to 
take place between them, and the experimental possibility of achieving this 
pair-wise over a very large array is highly restricted. 

One strategy might seek to confront this experimental difficulty head-on, 
by using specially designed physical environments in which quantum infor- 
mation may be exchanged over large distances, for example by coupling two 
quantum dots (qubits) through a very high finesse QED cavity mode [Jj. 
Another strategy would seek to confront the problem first at the quantum 
software level, that is: to design quantum error correcting codes which min- 
imize the necessity of long-distance interactions between qubits. Such is the 
goal of an alternative paradigm for error correction invented by Alexei Ki- 
taev, which develops a connection between the idea of the stability to errors 
in a quantum code and that of the stability to deformations in the topology 
of a 2-dimensional surface. 
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Chapter 2 

Topological Quantum Memory 



2.1 Lattice Codes 

The bàsic idea of fault-tolerance is to store information in such a way that 
any little errors occurring in the computer's components cannot do serious 
damage. But this is not a new idea; topological information has this same 
character, being invariant to local deformations of a specified geometry. To 
exploit this analogy, one must find a way to view a block of qubits as a 
geometrical object with the information encoded in the block corresponding 
to some kind of topological invariant. 

This is the idea behind Kitaev's framework for quantum error-correcting 
codes [inj [22], which organize qubits as edges in a 2d lattice on a toms. 
Kitaev found codes for which certain crucial recovery operations (syndrome 
measurements) are all local on the lattice, never involving more than a few 
neighboring qubits. Thus errors are severely limited in their propagation 
without the necessity of complicated fault-tolerant gate constructions — fault- 
tolerance is introduced at a more fundamental level. Moreover, fatal error 
processes are seen to arise only in the aftermath of large scale topological 
breakdowns in a recovery algorithm to be specified. 

The toric code TOR(/c) uses 2k 2 physical bits arranged in a k x k lattice 
(with edges identified) to encode two logical bits. Its stabilizer — i.e. the 
group of all transformations that do not affect the encoded information — is 
generated by star and plaquette operators 
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Figure 2.1: A plaquette operator B, comprising four Z's, and a star operator A, 
comprising four X's. 

respectively, where denotes the four edges emanating from vèrtex s and 
"□p" denotes the four edges enclosing face p (see Fig. l2.ljl . The code subspace 
is that fixed by A s and B p for all s and p. Note that any A s shares either zero 
or two edges with any B p , so all the stabilizer operators commute. Because 
of the two operator identities Yl s A s = 1 and Yl p B p = 1, only 2k 2 — 2 of the 
stabilizer generators are independent, giving 2k 2 — (2k 2 — 2) = 2 encoded 
qubits, i.e., a 4-dimensional code subspace. The connection to topology 
arises from the fact that the plaquette operators generate exactly the set 
of contractible loops of Z's on the lattice (see Fig. 12.2 jl . Likewise, the star 
operators generate exactly the the set of contractible loops of X's on the 
dual lattice (the lattice obtained by rotating every edge by 90° about its 
midpoint). 

To see how this is reflected in the assignment of logical basis elements 
("codewords"), let us find them explicitly. Consider the (unnormalized) state 

|oo) = JJ(i + A s )|o···o) = I i + J2 A s + Y. A ° As ' + ---) I '" ) 

s \ s s<s' / 

where 1 - - - 0) refers to all the 2k 2 physical bits and the barred bit- vàlues 
indicate logical qubits. Any B p applied to this state commutes through all 
1 + A s factors and leaves 1 - - - 0) fixed, so 1 00) is a +1 eigenstate of all 
the plaquette operators. |00) is also fixed by each star operator because, A r 
commutes through all the l+v4 s factors until it finds 1+A r , and A r (l+A r ) = 
1 + A r . Thus 1 00) can be taken as a codeword. To see the topological nature 
of this state expand the product as above. Each term in the sum represents 
a pattern of contractible co-loops (loops on the dual lattice) and the sum 
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Figure 2.2: A contractible loop (a), and a non-contractible loop (b) on the lattice. 



is equally weighted over all such patterns. In this sense, all "geometries" 
are summed over, leaving only topological information as far as error chains 
are concerned. Operating on 1 00) with any contractible co-loop of X's just 
permutes terms of the sum, leaving the state unchanged. 

More generally, any loop of Z's or co-loop of X's, contractible or not, com- 
mutes with all the stabilizer operators. If the loop or co-loop is contractible 
it fixes all codewords, but if non-contractible it non-trivially transforms the 
code subspace. In fact we can take Xi|00), X 2 |00), and XXX2I00) as the three 
remaining codewords, where Xj = Yl c . X is given by a non-contractible co- 
loop of X's running across the lattice horizontally along the path c x \ or 
vertically along c X 2 (see Fig. l2.3|) . (Here the index i refers to lofical not phys- 
ical qubits.) Thus Xi and X 2 act as the logical X's for bits 1 and 2. The 
logical Z's are given by Zj = T\ c a non-contractible loop of Z's running 
horizontally (i = 2) or vertically (i = 1) across the lattice. Note that c x i 
and c Z i run in perpendicular directions so that Xj and Zj anticommute. Also 
note that these constructions only depend on topology: the paths defining 
any of these operators may be "continuously" deformed without affecting 
their action on the code subspace since any such deformation corresponds to 
applying a contractible loop operator, which fixes all codewords. 

Suppose we have a state in the code subspace and apply an open co- 
chain of X's along some co-path P between faces q and p. This changes the 
quantum numbers for B q and B p from +1 to —1, generating "partides" at q 
and p. Now the sum-over-geometries is such that the resulting state would 
be exactly the same if we had used not P but some P' which is obtained 
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Figure 2.3: The logical X and Z operators for qubits 1 and 2. Each logical 
Z is given by a non-contractible loop of physical Z's, and each logical X by a 
non-contractible co-loop of physical X's. 

by "continuously" deforming P with its endpoints fixed. Information about 
which of the topologically equivalent co-paths is taken washes away in the 
superposition because P and P' differ only by a contractible co-loop of X's, 
which belongs to the stabilizer. Likewise, applying a chain of Z's between 
vertices r and s generates a dual kind of particle at r and one at s, with the 
same topological character. Given a lattice state we can measure all the star 
and plaquette operators to obtain a syndrome which just lists the locations 
of all the partides present on the lattice. To correct the errors indicated by 
the presence of the star (plaquette) partides we must group all the partides 
in pairs, connect each pair with a chain (co-chain) of our own, and apply 
Z (X) operators to the qubits along these chains (co-chains). Leaving aside 
the possibility of measurement errors, which will be addressed below, this 
transforms an arbitrary pattern of errors into a number of closed loops on 
the (dual) lattice (see Fig. 12.4)1 . What we want is that all these closed loops 
be contractible so that the logical qubits are left undisturbed. If one of the 
loops is non-contractible we will have unwittingly applied one of the X{ or 
Zi operators to our state, causing an error in the encoded information. 

In principle, it only takes k/2 errors lying along one non-contractible (NC) 
loop to undermine TOR(fc) irrespective of our particle pairing algorithm. But 
it would be exponentially improbable as k gets large, that if just k/2 errors 
occur they would be positioned in just the right way to do this. In general, 
measuring all the star and plaquette operators will collapse the lattice state 
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into a superposition of codewords all acted on by a definite set of single 
qubit phase (Z) and bit flip (X) errors. If decoherence/error processes act 
independently on separate qubits, and in a relatively uniform way, they will 
give rise to a certain probability, p z , for each qubit to undergo a phase error, 
and perhaps a different probability, p x , to undergo a bit error. Depending on 
p x and p z and on what algorithm we use to pair partides, there will be some 
probability that we are tricked into generating an NC loop when we think we 
have merely corrected errors. If this happens our state is corrupted, but we 
will see that such a recovery failure can be made exponentially improbable 
as k increases, a result reminiscent of concatenated codes. 

2.2 Repetition Code as a ld Lattice Code 

For practice and later reference let us examine the ld equivalent of TOR(fc), 
which uses a circle of k qubits instead of a k x k toric lattice. The plaquette 
operators do not exist here, and the star operator associated with vèrtex 
s becomes the product of X's over the two qubits touching s. In its own 
right this code, which is dual to repitition code discussed in Chapter [TJ is 
worthless because a single bit flip error causes a logical bit flip error. But 
understanding the statistics of z-error chains will prové useful for analysis of 
TOR(fc). 

Suppose our lattice code state is picked from an ensemble in which each 
physical qubit suffers a z-error with probability p, independent of all the 
other qubits. For example, we might have n = (0,1,1,0,0,0,1,0), which 
describes the ld lattice (with ends identified) 

— <— »• <— »• <— > — 

where indicates a z-error. Measuring the syndrome, we determine the 
locations of all z-error chain endpoints, in this case 



We must now guess which endpoints are connected to which others and apply 
our own recovery chain of Z's between each pair of "connected" endpoints 
to cancel the errors. In ld there are only two possible guesses corresponding 
to two complementary patterns of errors on the circle. So if we guess wrong 
the combination of errors and recovery chains will encircle the lattice, giving 
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n = (1, . . . , 1) hence a logical phase error. Otherwise we will have successfully 
corrected all the errors, giving n = (0, . . . , 0). Assuming the error probability 
p is relatively small, the obvious algorithm for particle pairing would be to 
favor the minimum total length of recovery chains. (For a 2d analog of this 
minimum distance algorithm, see [2*2*].) This algorithm, however, is highly 
non-local on the lattice; consider instead the following quasi-local alternative. 
First pair all partides separated by only one edge; contested pairings may 
be resolved randomly. Then pair any remaining partides separated by two 
edges, etc, until all partides are accounted for. In ld this algorithm may 
produce a number of recovery chains which overlap hence cancel each other, 
always resulting in one of the two bàsic guesses. 

The failure probability F, here referring to the probability of causing a 
logical phase error, derives from the set of all possible error configurat ions 
which can trick the algorithm into forming an NC loop of z-errors. In par- 
ticular we have the bound 

oo 

F = (m) < k h i( n )P n (2- 1 ) 

n=n( k > 

where hi(n) is the number of different error chains that the algorithm can 
generate with a fixed number n of z-errors and starting from a fixed vèrtex. 
In other words, h\ (n) is the number of ways n z-errors can trick the algorithm 
into flipping all the bits inbetween instead of correcting the erroneous bits 
themselves. The lower limit in the sum is the fewest number of errors 
necessary to cause the algorithm to generate an NC loop on a circle of size k. 
The ensemble average (nA refers to an arbitrary component of n, evaluated 
after recovery chains have been applied. One might then expect F is a 
kind of order parameter describing the topological order of error chains on 
the lattice. We shall see that for p below a certain critical error rate p c , 
our recovery algorithm maintains the lattice in a highly stable phase where 
(rii) <C 1 so NC loops are very unlikely. In the thermodynamic limit k — > oo, 
(rij) = in this phase, but what we want to know is exactly how small (rij) 
is as a function of k. 

To study F let us first calculate nS k \ or equivalently calculate the max- 
imum length l(n) of an [n]-chain — that is, an error chain generated by our 
algorithm and containing n errors. Clearly 1(2) = 3, since two lone errors can 
be separated by at most one edge if they are to be paired by our algorithm. 
This makes the [2]-chain «-> — <-> where the middle link is a recovery chain. 
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Now if we take two of these [2]-chains and join them through the longest pos- 
sible recovery chain (itself 3 edges long), what we have is the longest possible 
[4]-chain. We can continue to build up maximal [2 L ]-chains in this highly 
symmetrical, Cantor set pattern, and we find l(2 L ) = 3 L . Generating an NC 
loop requires an error chain of length at least k/2, so if k/2 is a power of 3 
we have = (k/2) 13 where (3 = log 3 2 ps 0.6309. If k/2 is not a power of 3, 
the maximum chain will have to involve asymmetric joining processes, which 
serve only to decrease its length relative to the Cantor chain trend. Thus 
l(n) = n 1 ^ serves as an upper bound on chain length in general, but it will 
prové useful to have an explicit expression when n is inbetween powers of 2. 

Consider the sub-chain structure of the maximal [2 L — 2 M ]-chain. We 
may emulate the Cantor pattern by dividing the 2 L — 2 M errors into two 
identical [2 L ~ l — 2 M ~ 1 ]-chains and extending the longest possible recovery 
chain between them. Iterate the process for each of these two chains, etc, 
until we have reduced the lot into [2 L ~ M — l]-chains and can go no further. 
Now it is not hard to determine l(2 N — 1). As the 2^ — 1 errors join in 
successive levels, they look just like a Cantor chain, except at each level 
there is always one runt sub-chain shorter than the rest. At the first level, [1]- 
chains join in pairs to become [2]-chains (<-> — <->) , except one is left unpaired 
resulting in the runt [l]-chain at the second level. Now the [2]-chains join in 
pairs, except one joins the runt giving the runt [3]-chain (<-> — <-> — <->), etc. 
The number of edges lost at each level relative to the corresponding Cantor 
chain follows: 2 edges at the first level; another 2 edges at the second; 

and at an arbitrary level, a number of edges equal to the sum of all previous 
losses. Summing the series yields a total relative loss of exactly 2 N edges, so 
l(2 N - 1) = 1{2 N ) - 2 N = 3 N - 2 N . Taking all of our [2 L ~ M - l]-chains as 
units in one big Cantor pattern, one finds 

l(n) = (3 L - M - 2 L - M )3 M = 3 L - (l) M 2 L , (2.2) 

which is the desired expression for maximal chain length when n = 2 L — 2 M is 
inbetween powers of 2, giving the correct results for the limits M = 1, L — 1. 
Note l(n) < n 1 ^ with equality when n is a power of 2. 

To bound the chain counting function h\(n = 2 L ), consider all ways 
an [n]-chain can be decomposed into an [m]-chain S and an [n — m]-chain 
S' joined by a recovery chain R. Neither S nor S' can contain any recovery 
chains longer than R, which means that S cannot contain any recovery chains 
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longer than 5" itself and vice versa. Thus we can write 

n-l 

hi(n) < /?, 1 (m|m < )/i 1 (n — m|m<) • (w< + 1) (2-3) 



m=l 

where m < is the lesser of m and n — m, and "|m)" reads "given that there 
are no recovery chains longer than m 1 ^," which is the maximum length of 
an [m]-chain. The factor m< + 1 counts all possible recovery chains R, 
including the "0-chain." To bound the sum, let us find the maximum value 
of /ii(m|m<)/ii(n — m|m<) over all possible m or, without loss of generality, 
over 1 < m < n/2. In general we expect the number of different chains to 
increase with increasing chain length, so we should find the m which allows 
for the maximum possible summed length l$s' of S and 5", corresponding to 
the two h factors. For a given m we have 

lss'{ m ) — l{m) + l{n — m\m). 

We know l(m) from ()2.2j) . and we can calculate l(n— m\m) by finding the error 
configuration which saturates the "|m)" constraint. This is done by dividing 
the n—m errors into groups of m errors, arranging errors within each group in 
the Cantor form, and linking these groups together through recovery chains 
of maximum length. Together with (J2.2|) . and using l{m) = m 1///3 , this yields 

l ss ,(m) = 2 n ~ m i(m) = — \(n- m)n 1/f3 - nin - m) 1//3 l . (2.4) 
mm 

It is straight-forward to show this function is strictly increasing over 1 < 
m < n/2, so that the maximum is achieved at m = n/2, which choice should 
then also maximize /i 1 (m|m < )/i 1 (n — m|m < ). Using ()2.3|) and the fact that 
hi(m\m) = hi(m) we have 

n-l 

h(n) < E 1 (n)h 1 {n/2) 2 where Ei(n) = ^(m< //3 + 1). 
Iterating the bound yields 

oo 

hi{n)< hi(l) JJS!(2 L ) 2 ^ =(8.872...)™. (2.5) 

L=l 



m=l 
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with the aid of some numerical evaluation. We might have put h\(l) = 
2 but instead use h\(l) = 1 because ld error chains cannot double-back 
on themselves. (At a chain's starting point ^i(l) = 2 holds, but this has 
exponentially small effect for a long chain.) Now í)2.1|) implies a concise 
bound on the (phase error) failure probability for this ld algorithm: 



where the actual accuracy threshold p c is no less than 1/8.872. 

2.3 Recovery with Perfect Measurements 

We must now extend the algorithm to 2d (again assuming no measurement 
errors), so that something like ()2.6|) applies to both z-errors and x-errors. To 
simplify analysis we make no use of correlations between phase and bit fiïp 
errors, so x-error correction on the dual lattice is formally identical to z-error 
correction on the lattice, and only the latter is addressed below. 

"Two partides separated by a distance l on the lattice" means that the 
shortest path between them contains / edges. The locus of vertices equidis- 
tant from a given vèrtex looks like a diamond. So, given a particle s in the 
algorithm's í-th step, we need to search for partners over all vertices on a 
diamond of radius t centered on s. As the algorithm proceeds from t = 1, 
error chains close into loops and join with one another until no open chains 
are left (see Fig.l2~3|). 

A bound on the failure probability is obtained as before, but we must 
calculate a new chain counting function li2(n) since a given ld chain may 
wander across the 2d lattice along many different paths. Consider an [m]- 
chain with endpoints r and s, which is to join an [n — m]-chain with endpoints 
r' and s'. If the joining occurs through s and s', then s must be closer to 
s' than to r. So, taking s fixed, s' must be somewhere within the diamond 
centered on s and passing through r. This diamond has radius at most m 1 ^ 
hence contains at most 2m 1 ^(m 1 ^ + 1) + 1 vertices. Thus in 2d we have 



h 2 (n) < Z 2 (n)h 2 (n/2) 2 where E 2 (n) = 2mf (mf + 1) + 1 (2.7) 




(2.6) 



n-l 
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Figure 2.4: At the end of one recovery round, all open error chains have been 
transformed into closed loops on the lattice. Recovery is successfull if, as above, 
all these loops are contractible. 



and iteration gives 



h 2 (n) < 



oo 

L\2- L 



/ i2 (4) 1 / 4 ns 2 (2 



L=3 



(75.38... ) n 



with the aid of some numerical evaluation. Note we have halted iteration 
after reaching /i2(4) in order to improve the bound. We have bounded /i2(4) 
itself by using diagrams to count all possible [4]-chains. For example, 

(<-<-)(• + - + )(~<-0 = (4·3)(l + 3 + 7·H)(3·3) 

is the contribution to h 2 {^) from the joining of two [2]-chains, each length 
2. The numbers arise as follows: we start at some fixed vèrtex and have 4 
choices for positioning the first error, leaving 3 choices for the second error. 
Our recovery chain may have length 0, 1, or 2, giving a number of choices 
equal to 1, 3, or 7 respectively. Then we have 3 • 3 ways to position the 
next two errors. If the recovery chain is two edges long, however, there are 
4 • 3 ways to position these two errors, hence the factor of | above. The 
factor | arises from the fact that pairing ambiguities are resolved randomly 
by the recovery algorithm. If the recovery chain has length 2, there is only a 
1/4 chance that the [2]-chains will join as above (for this to happen, one of 
the two interior vertices must be chosen first for pairing, and then it must be 
paired with the other interior vèrtex) . We may compute three other diagrams 
allowing for either of the [2]-chains to have length 3, and we obtain the bound 
/i 2 (4) < 4997. (In counting arrangements of a length 3 chain we consider two 
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separate cases, namely when the endpoints are separated by one edge and 
by three edges.) 

The failure probability bound in 2d is thus 

oo , v (fe/2)' 3 

F < k 2 h 2 ( n )P n = k \ -) ( 2 - 8 ) 

n=(|)/ 3 

with critical probability p c > 1/75.38. This result applies equally to x-error 
and z-error correction. 

However, we shall see that this represents an over-estimate in regard to 
the exponent (k/2) 13 . The bound O was derived by assuming that all 
chains of (k/2) 13 or more errors generate an NC loop, hence result in failure. 
But, for instance, of all the chains with n = (k/2f errors only a few can 
generate an NC loop because every error must be placed in just the right 
spot along a perfectly straight line for the chain to achieve the necessary 
endpoint separation. In general, the fraction of chains starting from a given 
vèrtex and capable of generating an NC loop on our k x k lattice will be 
so me function f(n), tending to unity as n gets large. This function should 
multiply the chain counting function h,2(n) in our failure probability bound 



We know the vP^ = (k/2) 13 contribution to F should go like p n(fc) because 
once a starting point is picked, the positions of the errors are essentially 
all fixed if the chain is to generate an NC loop. Thus f(n^) = (p c ) n<k) to 
get the right term for n = in (J2.8|) . Also f(n) < 1 by definition, so we 
can bound 



f(n)< 



n 1 " 3 -k/2 f 
n — k/2 



for some n > (k/2) 13 . Using this in ()2.8|) with h 2 (n) multiplied by f(n) and 
finding the maximum term in the sum allows us to sharpen (j2.8|) insofar as Uq 
exceeds (k/2) 13 . Physically n represents the saturation point at which adding 
one more error to a chain stops having so great an effect on the chances of 
its being able to generate an NC loop. To get a hold on the value of n 
first consider only geodèsic error chains — chains of extremal length for fixed 
endpoints. Statistically this category will be dominated by nearly diagonal 
chains. But a diagonal chain must have length at least k to generate a NC 
loop, so adding one more error will be irrelevant only if n > k 13 . In general, 
error chains will be sub-geodesic, so that we expect the saturation point n 
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to exceed k 13 . Using this as a bound, we find the maximum term in the sum 
of (j2.8|) . if it exists, satisfies 

l 

W~k/2)_ 

where l = n 1 ^ and we have neglected 0(p c ) corrections. If this is satisfied 
by no Z < k, the end term with l = k is the maximum term. Since the above 
function is strictly increasing on < / < k, this occurs if p/p c exceeds the 
right hand side above evaluated at l = k, which is e _2///3 ~ 0.0420. So we 
have the estimate 

/ \ kl3 

F ~ k 2+ ? ( ^\ (2.9) 

where the exponent k^ applies rather than (k/2) 13 if p > 0.0420 p c . 

We have sought to test these results through numerical simulations of the 
recovery process. For a given torus size k, we perform ~ 10, 000 individual 
recovery simulations for eight vàlues of p from 0.01 to 0.07. Each Monte 
Cario run starts by generating a random pattern of errors, each edge with 
error probability p, and implements the expanding diamonds algorithm until 
all partides are paired. Recovery success or failure is determined by checking 
for NC loops. For each p, F is just given as the failure frequency, which we 
fit with (J2.9)) as a function of p, for fixed k, yielding the Ar exponent as a 
fitted paramter value in a log-log plot. (Here we neglect the prefactor k 2+l3 .) 
The logarithms of these extracted k 13 vàlues are plotted against k in Fig. 12.51 
According to ()2.9|) . the result should be a line with slope (3 = log 3 2 ps 0.6309. 
The measured slope is quite close: 0.627 ± 0.008. The intercept — predicted 
as the log of the coefficient of k 13 in (J2.9)) . hence zero — is measured to be 
0.02 ± 0.03. Measured vàlues of the accuracy threshold p c for each k are all 
comfortably consistent with the bound p > 1/75.38 obtained above. 

In picking the data to fit, we must select a maximum p (or, equivalently, 
F) since the actual scaling for which (J2.8|) is a bound must break down at 
some p greater than the bound obtained for p c . Here, a cut-off at F = .05 
was applied. We also select a minimum F to limit Poisson scatter, chosen 
to minimize the Standard error for our measured value of (3. Scatter in the 
plot arises not only from Poisson fluctuations but also from the fact that 
actual vàlues of f3 for finite k differ from the theoretical value log 3 2, which is 
really an asymptotic (k — > oo) prediction. These finite k effects involve the 
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log(|log F|) 




Figure 2.5: Recovery failure rates F as a function of lattice size k, obtained from 
Monte Cario simulation of errors. The vertical axis is taken as the logarithm of 
k 13 vàlues extracted from failure rate data and is effectively a double logarithm of 
F. The line represents our analytic result for the exponent (3 = log 3 2. 

interpolation which must take place between Cantor chains whose lengths 
are all powers of 3. 

2.4 Recovery with Imperfect Measurements 

Until now we have assumed perfect syndrome measurements. But suppose we 
err in measuring each star operator A s with some probability q (which might 
be the same order as the physical bit error probability p) . This mistake would 
lead us to think a particle ( "ghost particle" ) exists at s when there really is 
none, or that no particle exists at s ("ghost hole") when there really is one. 
Because the bàsic expanding diamonds algorithm becomes unstable when 
ghosts are introduced, we must modify it and apply our failure probability 
analysis (chain counting, etc.) to the modified version. 

Imagine recovery (with measurement errors) via expanding diamonds. 
We would generate contractible loops, hence correct real errors, but recovery 
chains would also connect ghost partides to one another and to real partides. 
Once a chain connects to a ghost particle it can no longer propagate from 
that endpoint because there is no pre-existing error chain to continue it to 
another particle. So in addition to all the loops generated by recovery, the 
lattice would also be left with open chains that carry over to the next round 
as if they arose from spontaneous errors. Failure might occur, as before, by 
the generation of an entire NC loop in one recovery round or, now, over many 
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rounds. 

Unfortunately, the left-over chains quickly begin to dominate the failure 
rate. Suppose q were small enough that in a particular round just two ghosts 
occurred. They would typically be separated by a distance 0(k) on the 
lattice. Since they are the only ghosts around, and open chains end only 
on ghosts, these two will be connected by a recovery chain, which will give 
an 0(1) chance of failure in the next round, independently of k. We cannot 
remedy this situation simply by repeating syndrome measurements a number 
of times to increase confidence in their results. The reason is that no matter 
how many times we repeat, there will always be processes involving just a few 
errors and ghosts (hence occurring with bounded probability) that corrupt 
the supposedly verified syndrome. For instance suppose, exactly half-way 
through a series of repeated rounds of syndrome measurement, a real error 
occurs with endpoints r and s, but we err in measuring A r . Majority voting 
after the final round would trick us into accepting s as a real particle, but 
not r, effectively generating a ghost in our verified syndrome. 

So we need a better algorithm. The first thing to realize is that we will 
inevitably leave open chains behind from one round to the next. The only 
way to prevent ghosts from generating long chains is to be suspicious of 
calls to connect widely separated partides. Suppose we are lead to consider 
generating a recovery chain between two partides separated by a distance / 
in the Tth round. Should we do it? If / is large, the hypothetical chain is 
more likely a pair of ghost partides. But age is also important: the longer 
the partides have been around (left unconnected in previous rounds), the 
less likely they are to be ghosts. To keep track of particle age information, 
imagine a 3d lattice comprising 2d shelves representing successive rounds of 
syndrome measurement. Partides which are the endpoints of left-over chains 
will be registered from their birth to the present, forming vertical "world 
lines." Pairing partides in the current round should be done by reference 
both to their spatial separation on the current 2d shelf and also their temporal 
separation, i.e., the number of rounds having elapsed between their respective 
births. 

Ghosts can eclipse partides or join onto chains themselves, either way 
causing an age discrepancy between chain endpoints. However we attempt 
to correct these types of errors, there is always the additional possibility that 
we generate more of them ourselves. As we shall see, the kinds of processes 
which result in chains with large spatial displacements in 2d have analogs in 
3d which generate large temporal displacements. And as before, the more 
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defects (now including ghosts), the greater these separations can be. This 
suggests we treat temporal and spatial separations on the same footing: the 
algorithm should connect partides according to some definite combination of 
their spatial and temporal separations. A natural generalization of expanding 
diamonds is found by extending the 2d spatial mètric on the lattice into a 
3d space-time mètric: 

k = l + a\AT\ (2.10) 

defines the 3d distance /* in terms of the spatial and temporal displacements l 
and AT. Diamonds in 2d become octahedra in the 3d lattice — an octahedron 
of radius being defined as the locus of points separated from a given vèrtex 
by a distance in (Z* — 1, /*] according to the "*-metric." Note these octahedra 
are squashed in the time direction by the factor a. The value of a should 
be chosen according to the frequency of measurement errors relative to real 
errors. The smaller the rate of measurement errors, the less probable it is to 
generate age differences, so a should be higher. 

At each step in a given round of recovery, scaled octahedra of fixed size 
are extended around the birth sites of partides currently available. Once 
a given particle's octahedron encounters another particle's birth site, the 
partides are paired and marked as "unavailable." In 2d, a recovery chain 
would be applied between every particle pair. Now that is inadvisable due 
to the presence of ghosts. Having paired two partides r and s, we should 
determine whether it is more probable that either (i) they are associated 
with two independent error chains whose other endpoints may have been 
obscured by ghosts, or (ii) r and s are in fact endpoints of the same chain. 
These probabilities are determined by the number of defects necessary to 
account for r and s under the assumption (i) or (ii), so we should find a 
3d analog of the 2d result that at least l 13 errors are necessary to generate a 
chain of length /. As we shall see, the obvious generalization is approximately 
correct: Zf effective defects, accounting through a for the different occurrence 
probabilities of real errors and ghosts, are necessary to generate a chain of 
length U in the *-metric. In addition we will find, as would be expected, 
that at least ~ T 13 effective defects are necessary to maintain a chain in 
existence for T rounds of recovery. These two results allow us to compare 
the two probabilities associated with cases (i) and (ii) above. This is done by 
comparing the number of effective defects required for each case, which are 
T/ 5 + Tf and 1% respectively. Here T r and T s are the ages of the two partides 
r and s, and is their *-metric separation. Once the 3d algorithm is done 
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pairing partides, we apply a recovery chain between any given pair r and s 
whenever 

12<T? + Tf, (2.11) 

which imposes a variable pairing-length cut-off on the algorithm. 

Ambiguities in this 3d algorithm can arise in the process of identifying a 
current particle with a particular birth site. If errors occur on edges touching 
the original birth site, the particle's vertical world line may continue on a 
vèrtex displaced from the original. Also, ghosts may eclipse a particle in a 
given set of rounds, leaving holes in its world line. These difficulties may be 
overcome on a round-to-round basis by simply requiring that the age ascribed 
to a vèrtex be conserved if its particle has been left over from previous rounds, 
i.e., has not yet been paired. If a left-over particle suddenly disappears, we 
probe with expanding diamonds around the eclipsed particle until we find 
an uneclipsed particle who could inherit the lost age. If the probe radius 
becomes large enough that the likelihood of eclipse due to ghost overtakes 
the likelihood of eclipse due to real errors, we conserve age by manually 
adjusting our syndrome record as if we had detected a particle at the vèrtex 
in question. (We continue to alter the syndrome by hand, if need be, until it 
becomes more likely that the hypothetical eclipsed particle is actually just a 
string of ghosts.) 

Having now specified an algorithm in 3d, we must redo our failure rate 
analysis taking into account the time dimension and the leaving over of chains 
from one round to the next. Our method is basically the same as before, but 
the chain counting function h^in) must be generalized to h 3 (n,n), where n 
is still the number of real errors and fi the number of ghost errors involved 
in the chain. The failure probability bound now becomes 

F < k 3 ^ h 3 (n, n)p n q n (2.12) 

n,n 

where the sum is taken over all pairs (n, n) capable of generating an NC loop. 
Note that h 3 (n, n) counts chains involving errors which may have originated 
in previous rounds but have lasted through the present. We again obtain 
a recursion relation, now for hz(n,n) in terms of h 3 {m < n,fh < n), by 
considering all ways an [n, n]-chain could be broken into an [m, m]-chain S 
and an [n — m,n — m]-chain 5". Recali that in ld the coefficient (m< 13 + 1) in 
the recursion relation ()2.3|) counted all the ways to choose the recovery chain 
connecting S and S'. In 2d this coefficient became the area of a diamond of 
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radius m < . And now in 3d it becomes the volume (in vertices) of a *-metric 
octahedron with radius l*(m,fh) < , defined as the lesser of the two maximal 
*-metric lengths /*(m, fh) and Z*(n — m, n — fh). The new recursion relation 
is 

n n 

h 3 {n, h) < V{U{m, m)<)/i3(m, fh)h 3 {n — m,h — fh). (2.13) 

m=0 rh=0 

Again we want to bound the sum by finding the maximum h ■ h term, 
now varying both m and fh. By generalizing the ld/2d relation l{n) = n 1 ^ 
we will later see that chains can grow longest when ghosts are uniformly 
intermixed with real errors. It turns out they work best by cooperating, as 
opposed to, say, having all the real errors combine on one side of the chain 
and all the ghosts on the other. So a maximal chain, hence the maximum h ■ h 
term, must have uniform composition, m/fh = n/h. Thus we can perform a 
calculation similar to that which gave ()2.4|) . but with l(m) — > /^(m, (h/n)m). 
In fact we need just observe, as will be shown later, that U{m, (h/n)m) grows 
faster with m than does l(m). Now l$s' ~^ hss 1 is determined by the first 
equality in (J2.4)) . which still holds in 3d, so that if lss'i m ) was increasing on 
l<m<n/2in ld/2d, so must be l*ss' m 3d. Thus the maximum is located 
atm = n/2, hence fh = h/2, and (j2.13|) becomes 

h 3 (n, h) < E 3 (n, h)h 3 (n/2, h/2) 2 , (2.14) 

where 

n n 

S 3 (n, h) = ^ ^2 V ( l *( m ' m )<) 

m=0 m=0 

and the octahedral volume is given by 

[h/a] 

V{U)= 2 [Z, - qí|AT|] [Z, - a|AT| + 1] + 1, 

AT=-[l»/a] 

with [■ • ■] denoting the greatest integer function. The chain counting function, 
hence the critical probabilities we shall soon bound, depend crucially on the 
function l*(n,h) which bounds the *-metric length of a chain containing n 
real errors and h ghosts. As there is no compact expression in general, we 
need to investigate particular vàlues of n and h. 
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The bàsic constraint on the length of an error chain is triat none of its 
recovery chain components can be longer (*-metric) than either of the sub- 
chains which it joins. Consider the case n = 0. Even without any ghosts, 
the chain has extra freedom in the 3d lattice. Two purely spatial sub-chains 
may be joined by a purely spatial recovery chain (not exceeding either of 
their lengths), or they may trade space for time so that one chain occurs in a 
recovery round before the other. But no extra *-metric length can be gained 
by trading space for time, because for any "time-like" recovery chain there is 
always a "space-like" recovery chain of equal or greater length. This implies 
ï*(n,0) = n 1 ^ just as in 2d. Now consider adding one ghost to a pre-existing 
error chain. If, for instance, a ghost at s is connected to one of the chain's 
endpoints, s will show up as a new-born particle in the next round (which 
would not be the case had the ghost been a real particle, hence the endpoint 
of another chain). Thus, the ghost generates an age difference between the 
endpoints of that chain. Depending on the value of a, the algorithm might 
permit a newborn ghost to join onto an older chain, causing a greater age 
difference. To simplify analysis let us fix at so that no newborn (i.e., age 1) 
particle may be joined with a particle of age greater than 2. Consider two 
partides of ages 1 and 3, separated by one edge. The condition that they 
cannot be joined by the algorithm is given by the pairing length cut-off (|2.11|) 
as (1 + 2a) /3 > 3 13 + = 3, comfortably satisfied by choosing a = 2.4. From 
this it follows by checking cases that the most an add-on ghost can extend a 
chain is to add a spatial separation of 2 edges and a final age difference of 2 
rounds. If n ^> n, the maximal chain has a "unit celi" comprising n/n real 
errors arranged in a Cantor chain with one ghost added on to the end. Unit 
celis are strung together as singe error units in the Cantor pattern, giving 

k(n > n) = n l/li + (2a + 2)n 1//3 

which retains the bàsic 1/(3 scaling exponent. Considering n and n powers 
of two, one can check that this relation holds for n > 4n. If n and n in this 
range are not powers of two, the above may be taken as a bound. Also, one 
may experiment with smaller unit celis to obtain 

h(n = 2n) = (3 + 2a)n 1/f3 and h(n = n) = (2 + a)n 1/f3 . 

We may interpolate between the above three results by an appropriate step- 
function to achieve a bound on l*(n,n) for all n > n. For n = 0, one can 
investigate candidate maximal chains with a defmite number of ghosts per 
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unit celi. Each unit celi is gotten by saturating the cut-off (|2.11j) . It turns 
out the true maximal chain has six per unit celi and scales according to 

k(0,n) = (7 + 8a)(n/6) 1//3 . 

For iïCíi, the maximal chain unit celi has n/n ghosts and one real error. It 
may be divided into many sub-units, each comprising six ghosts, except for 
one odd sub-unit which also has the one real error. Actually, we can make 
the unit celi a bit longer by giving one of the six ghosts in the odd sub-unit 
to a different sub-unit. This construction gives 

k( n < n) = (6 + 2a)n 1/l3 + (7 + 8a)(n/6) 1/(3 , 

which holds as a bound for n < n/6. The *-metric lengths of chains for 
n/6 < n < n may be obtained by inspetion when n/n is integral. We will 
not overtax the reader with all these formulas. Again, is bounded by 
interpolating to a step-function for intermediate cases. Altogether we have a 
means of bounding Z*(n, n) for any vàlues of its arguments. Note that these 
expressions for l^(n,n) have been obtained by mixing the n real errors and 
n ghosts uniformly, hence the "unit celis." That uniform mixture maximizes 
*-metric length can be checked by comparison to the lengths, computed with 
the above formulas, of segregated chains comprising, e.g., one piece with only 
real errors and another with only ghosts. 

Now that we have a handle on all the quantities involved in our recursion 
relation (|2.14|) . let us use it to bound h 3 (n,n). First consider the case n = 
n/n > 1. Recursion brings us down from h 3 (n,n) to the factor h 3 (n/n,l), 
assuming both n and n are powers of two. And we have 

h 3 (n, 1) < 2 E 3 (n, 0) h 3 (n/2, 0) h 3 (n/2, 1), (2.15) 

expressing a division into two sub-chains of equal defect number, except that 
one has a ghost and the other does not. The factor of 2 arises because the 
one ghost may be put in either of the two sub-chains. We may recursively 
substitute for h 3 (n/2, 1) in this relation to obtain an expression involving 
no h factors other than h 3 (2 N , 1) and those of the form h 3 (m, 0). (The 
integer N may be chosen freely.) The h 3 (m,0) terms may all be reduced 
to h 3 (2 N , 0) using the original relation (j2.14|) with n = 0. When account 
is taken of all the S 3 factors produced by these recursions, one finds the 
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quantity q n h 3 (n > 2 N n) is bounded from above by 
nq h 3 (2 N , 1 



2 N h 3 (2 N ,0) 



II S 3 (2 L n,2 L ) 2 " 



L=N+1 



log 2 n 



h(2 N ,0f- N I] S 3(2 L ,0) 5 



L=iV+l 



(2.16) 

(For n <2 N the second product over L should be set to unity.) Note that the 
bracketed expression {■ ■ ■} has the form g(n, q), depending on n and n only 
through n = n/n. Strictly, we have obtained ()2.16|) only for n a power of 
two. Calculating intermediate cases, we would expect to find a correction to 
(I2.16|) resembling the correction (J2.2j) to our ld/2d scaling law l{n) = n 1 ^. 
We can now express the n > n part of the sum in ()2.12|) as 



n>n 



n>l n 



where the first sum is only over pairs (n, n) capable of generating NC loops, 
which we have converted to a sum over n, n. For a given n the sum over n 
begins at a defmite value n = (k/2 , y(n)) 13 , which is determined by one of the 
> n) formulas as the minimum n such that n real errors together with 
n = n/n ghosts can generate a chain of spatial length k/2. In particular, 
7(n) is maximum at n = 1 where the Z*(n = n) formula gives 7 = 2 + a. 
Considering the sum over n as already performed, terms in the remaining 
sum over n depend on p, q, and h alone. Fixing p and q, the maximum 
term will occur at some n = n(q,p), which then determines an asymptotic 
{k — > 00) critical probability p c (q,p) = í/g(n(q,p),q) and scaling exponent 
through , y(h(q,p)). 

The case n < n follows in the same way, and a bound is obtained for 
the quantity p n h 3 (n < n) which is exactly the expression (j2.16J) with q «-> p 
so that the two arguments are interchanged in all the h and E functions, 
n <-> n, and n — > 1/n. We may denote the resulting expression inside {• • •} by 
g(n,p), which gives rise to a measurement error critical probability q c {q,p) = 
l/g(n(q,p),p) and scaling exponent (k/2j) 13 . Thus the failure rate bound in 
3d is 



F < k 



P 



.Pc(q,p) 



. 27 , 



+ k 3 



q 



Qc(q,p) 



. 27/ 



C2-17) 



where 7,7 < 2 + a both depend on (q,p). The accuracy threshold is no 
longer a single point p c as in the case of no measurement errors, but is now a 
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definite curve in the gp-plane — the boundary of the region lying underneath 
the two curves given implicitly by p = p c {q,p) and q = q c (q,p)- This is a sort 
of phase boundary between the well-ordered, sub-threshold state where long 
error chains are exponentially improbable and the disordered state where long 
chains occur frequently and the encoded information is quickly corrupted. 

To obtain this threshold curve one could calculate p c (q,p) and q c {q,p) 
directly, or choose a far less intensive method which is to calculate g(iï > 
l,q = 1) and g(n < l,p — 1) for a number of vàlues of n and use the 
fact that g(n,q) = g{n,l)q 1 / n and g(n,p) = g(n,l)p n to obtain curves in 
the p-q plane corresponding to thresholds for individual contributions to F 
from chains with fixed error-ghost composition n. The region underlying 
all these curves is exactly the sub-threshold region. We have numerically 
calculated g(n = 2 M , 1) and g(n = 2" M , 1) for M = 0, 1, . . . , 16, oo. In 
these calculations we set the recursion limit TV in (I2.16J1 by N = min{M, 4}. 
This means we need to calculate bounds on h 3 (2 L ,m) and h 3 (m,2 L ) for 
L = 0, 1,2; m = 0, 1. We reduce h 3 (4:, 1) by one application of (|2.15jl . and 
/i3(l,4) by one application of the analogous relation with q <-> p. Bounds 
on the remaining h^s are again obtained by inspection of diagrams. For 
example, the diagramatic break-down of hs(4,0) is almost identical to that 
of /i2(4) in 2d which we have already calculated. The only additions are 
diagrams involving errors distributed over múltiple rounds of recovery. In 
particular, by reference to (|2.1(J|) and (|2.11|) one finds 

h 3 (4, 0) = h 2 (A) + (<-►--«-►)(«-►«-►) = 4997 + (4 • 3 • 3)|(4 • 3) = 5105 

where the chain within the first parentheses occurs one round before the 
chain within the second. 

The vàlues obtained for log(g) and log(g) are roughly linear in 1/n and 
n respectively. The points n = 2 ±M chosen for these calculations possess 
high symmetry in the same way the points n = 2 M posses high symmetry in 
regard to the ld/2d scaling function l(n). Expecting for g and g a similar 
kind of peak-structure around points of high symmetry as was observed in 
ÍI2.2|) for l(n), we use linear interpolation for reasonable bounds on points 
intermediate between n = 2 ±M for M — 0, 1, . . . , 16, oo. The resulting sub- 
threshold region is a foot-like area with its heel at the origin of the gp-plane 
(see Fig. l2.6|) . The phase boundary has three main parts, coming respectively 
from the contributions to F corresponding to n = 2,1, and 1/2, hence to 
chains of twice as many real errors as ghosts, of equally as many, and of half 
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Figure 2.6: Accuracy threshold for recovery with faulty syndrome measurements 
represented as a phase boundary in the qp-plane. Each curve correesponds to error 
chains of a different composition h, and the region underneath them all is that in 
which recovery is stable. 

as many. The ankle is cut-off around p = 1/114.5 by threshold curves for 
higher n, hence more real errors, and the toes are cut-off around q = 1/115.3 
by curves for lower n, hence more ghosts. These two vàlues, then, represent 
the limiting accuracy thresholds achieved by the 3d "expanding octahedra" 
algorithm. 

Now these exponents (k/2 r y) lS and {k/2^Y are over-estimates, just as in 
2d, due to the conservative assumption that any chain with a sufficient num- 
ber of defects to form an NC loop will in fact do so. Recali in 2d consideration 
of the saturation point n for geodèsic chains suggested a conservative esti- 
mate of F with exponent (k/2) 13 replaced by k@ if p > 0.0420p c . The same 
arguments apply in 3d, except geodèsics may now move in the time direction 
as well. For a long geodèsic chain, the 2+1 components of its displacement 
will each average to the same *-metric length. An NC loop can be generated 
when this length is k/2, so the total *-metric length of the chain is 3k/2. 
Consider a defmite radial line in the (q,p) plane, on which F is dominated 
by one particular n threshold. The saturation point analysis here is essen- 
tially the same as in 2d, and one finds the 3d exponents are improved to 
(Sk/2-f) 13 and (3k/2^) /3 iïp/p c > 2e" 2//3 « 0.0840. 

Numerical simulations with measurement errors are performed, leaving 
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over chains from one round to the next. Recovery failure is assumed to occur 
if either an NC loop occurs or an error chain's endpoints achieve a spatial 
separation of k or more (which would shortly lead to an NC loop). After a 
failure, the lattice is reset to an error-free state, and recovery resumes. The 
failure rate is calculated as the the number of failures divided by the total 
number of rounds. To speed up simulations the expanding octahedral radii 
are incremented in steps larger than one depending on the ages of current 
partides. The increment is chosen so that only five steps are necessary per 
round independent of k, which was not observed to significantly affect per- 
formance. In these simulations we set q = p/2, with k ranging from 10 to 60 
for each (q,p). The critical behavior is understood from our gp-plane (Fig. 
12. 6|) by walking out from the origin along the line p = 2q. This line happens 
to cross the phase boundary right near the edge of the segment dominated 
by the n = 1 threshold curve, corresponding to 7 = 2 + a. Since this is the 
maximum possible 7, F will indeed be dominated by the n = 1 threshold 
in this edge region. Thus our theoretical prediction here, with saturation 
improved exponent, is 



neglecting a polynomial prefactor. Here p c , specific to the case p = 2q, is 
at least 1/329.8. This result for the exponent (3k/2 / y)^ is shown as a line 
alongside the simulation data in Fig. 12.71 The agreement here is not as good 
as in the absence of measurement errors, but that is to be expected given 
the added complications of the 3d algorithm. These would presumably tend 
to enhance the finite k deviations from our asymptotic predictions. Still, 
our conservative estimate is good, with only a couple of the data points 
completely below the line (indicating failure rates above the estimate). 

Bravyi and Kitaev [20 j and, independently, Freedman and Meyer ^0] have 
exhibited lattice codes like TOR(fc) but using simply connected lattices with 
boundary in the plane. This presents a major advantage for any ultimate 
experimental implementation. A recovery algorithm appropriate to these 
codes is just that given above, but modified to account for the possibility 
of error chain endpoints being hidden on the boundary. Since the boundary 
is asymptotically irrelevant to scaling properties of lattice codes, the above 
results would seem to carry through. 
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Figure 2.7: Recovery failure rates F as a function of lattice size k from Monte 
Cario simulations including measurement errors ( "ghosts" ) alongside the theoret- 
ical prediction (line). 



2.5 Lattice Codes on High Genus Surfaces 



In Kitaev's topological framework for quantum codes, the bàsic determiner 
of a code's fidelity 1 — e in maintaining encoded information is the length L of 
(i.e., the number of edges contained in) the shortest possible non-contractible 
loop on the lattice. In particular, we have seen that the code's failure prob- 
ability scales as 



where p is an error probability for physical qubits, and p c , (3, and K are 
parameters depending on the particular error correction algorithm used. 

2N qubits may be encoded in iV separate lattices, each with fidelity given 
above. What we will show here is that, if instead of iV separate lattices we 
combine them into one large lattice on a high genus surface constructed by 
a certain method, the information rate and/or fidelity may be improved as 
the number of encoded qubits increases. 

As a motivating example, consider joining two L' x L' toric lattices by 
removing a Z//2 x L'/2 square from each and sewing together the perimeters 
of the resulting square holes; it is straightforward to define a code on this 
new lattice preserving the total number, four, of encoded qubits. Taking 
L' 2 pa 4L 2 /3, the number of physical qubits is nearly the same as for two 
separate L x L lattices, but the minimum length of non-contractible loops is 
now pa 2L/y/3, improving the code's fidelity. 

This suggests, given N separate L x L toric lattices, we combine the 
2L 2 N physical qubits into one large high-genus surface on which each torus 
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becomes one handle. We can increase the code's fidelity by the following 
construction. Cut each handle through its width, along a "w-loop" (see Fig. 
12. 8|) . giving two loose ends per handle. Then randomly re-pair the set of 2N 
loose ends and rejoin each pair. 



Figure 2.8: A simple w-loop and simple 1-loop are shown on two different 
handles. 

Before cutting and rejoining w-loops, the lengths of length-wise "1-loops" 
had been as small as L; now the shortest simple 1-loops for a typical handle 
will have length ~ LN. If each handle is made to encode a qubit with Z 
operator given by an 1-loop, it appears the chances of a Z error to these 
encoded qubits has been markedly diminished. One might like to say this 
error probability is now exponentially small in (LN) 13 . But there are more 
complicated 1-type loops with lengths much smaller than LN; each such 
loop involves many handles. The minimal 1-loops of this kind determine the 
encoded Z error probability of a lattice code based on this large surface. 
Let us first determine the lengths of minimal 1-loops, then symmetrize the 
construction of our surface to handle encoded X errors as well. 

To characterize the minimal 1-loops we will calculate some geometrical 
properties of our high-genus surface, on which it will be convenient to recali 
the square lattice of qubits. If each handle connects to the surface through 
a square patch, the lattice will be locally identical to a simple square lattice 
except at the còrners of a square patch. Each còrner vèrtex has valence 
(number of edges containing it) equal to five not four, giving five quadrants 
of 2d space (see Fig. l2.9|) . The còrner constitutes a kink of negative curvature 
on the otherwise flat lattice. 

Around some vèrtex P draw a small diamond. As its radius r increases, 
the diamond will encounter handles over which it must climb, extending out 
to new places on the surface. From an intrinsic perspective, the diamond 
merely sees an occasional kink, from which emanates an extra quadrant of 
space. Having passed over a kink and encroached into its extra quadrant, 
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the diamond's perimeter will become larger than it would be apart from the 
kink. It is not hard to see the perimeter will contain an additional r — 
vertices, where ?\ is the distance from P to the kink. 

As the diamond expands, it encounters more kinks and its perimeter 
grows even faster. Moreover, all kinks contribute independently to the perime- 
ter. Assuming a constant density, 8/L 2 per vèrtex, of kinks on the lattice, 
the perimeter c(r) approximately satisfies the recursion relation 



obtained by adding independent contributions from all kinks within the di- 
amond; c(r) = 4r would be the result in flat space. The above relation can 
be cast as a second order finite difference equation, with initial conditions, 
whose solution is approximately 



To obtain the minimal 1-loop length for a given handle H, consider the 
set of open paths of length r and starting at some vèrtex P around the base 
of H. As r increases, the diamond forming the outer boundary of this set 
will be pushed across various handles to random new places on the surface, 
gradually filling it up. Once a path encounters the other end of H, it can 
be closed across H to form an 1-loop. The chances there will be such a 
path become significant only when the area of our r-diamond approaches a 
significant fraction of the total surface area, L 2 N. Obtaining the diamond's 
area by summing (|2.2Uj) . this condition is found to be r ~ L \ogN/ y/8, which 
thus gives the minimal 1-loop length for almost all handles on the surface. 
As for the other handles, one can either attempt to re-pair them or simply 
discard them (close and eliminate them as handles). 

The probability of encoded Z errors associated with 1-loops is thus expo- 
nentially small in (Llog N)@; however no improvement has been achieved for 
X error correction. To symmetrize the above construction, we simply add 
an additional cut-and-pair step. Previously we had cut along a w-loop on 
each handle and then re-paired all the cuts. Now we cut along an 1-loop, 
which may be as short as ~ L log N, and randomly re-pair as before. There 
is no longer a simple w-loop that can be drawn encircling a given handle. A 
w-loop must proceed across many of these cuts before it can close on itself 




(2.19) 



r k =0 




(2.20) 
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non-contractibly. Topologically, this new step is identical to the previous one 
but with the surface turned inside-out. The effect of these new cuts and joins 
is just a doubling of the kink density to 1Q/L 2 in ([2. 19)1 . giving the minimal 
loop length — now for 1-loops and w-loops — as ~ LlogiV/4. 

This result can be applied either as an improvement to the L, iV-dependence 
of the lattice code's fidelity at fixed information rate, an improvement to the 
information rate at fixed fidelity, or as a simultaneous improvement to both. 
But there is another parameter to be considered: the accuracy threshold p c . 
Indeed, this convoluted surface topology suggests a greater variety of possible 
catastrophic error processes (long error chains) which would tend to worsen 
the threshold. 

Nevertheless, as L is increased at fixed N the error processes affected 
by the convoluted topology will only be those involving longer and longer, 
hence less and less probable, error chains. As L gets large the threshold 
will then tend back to its original value. For the quasi-local error correction 
algorithm presented above with f3 = log 3 2, the effect of convoluted topology 
is to multiply our bound on the threshold p c by 

(l/3 fe )' 3 

~ gg-^OogAO 1 -* 3 /!/ 

where a(r) is the circular area obtained by summing (|2.2üj) . This means that 
N should not be increased faster than logiV ~ ]_piQ-P) or e i se the code's 
accuracy threshold may be greatly diminished. Thus the fidelity 1 — e from 
(I2.18J) will scale as 

-loge~(LlogiV) /3 ~L /3/{1 ~ /3) . 

Were we to naively put (3 = 1 above, we would find that there is no restriction 
on how N scales with L — unbounded gains could be had by increasing N at 
fixed L without serious damage to the accuracy threshold. In fact a different, 
global recovery algorithm does have (3 = 1, and explicit consideration of its 
performance under convoluted topology corroborates this result. Bounds on 
the threshold are obtained here by counting certain classes of paths on the 
lattice ^S]. For instance, the number of length r paths with given starting 
point on a flat square lattice is 4 r . On our curved lattice, some vertices 
(the kinks) have valence 5, so the number of walks is bounded by v r with 
4 < v < 5. The effect of convoluted topology on a threshold bound based on 
this counting would be to multiply it by the near-unity factor A/v. 
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We have thus shown how to achieve gains in the fidelity and / or efnciency 
of storing quantum information by encoding many qubits in one block of 
a topological quantum code. The code involves a lattice of qubits on a 2d 
surface of highly convoluted topology. As more encoded qubits are added, 
keeping fixed the number of physical qubits per encoded qubit, asymptoti- 
cally significant gains are obtained in the code's fidelity This is an economy 
of scale in the error correction hardware independent of any software gains 
achieved by compressing redundancy within the encoded information itself, 
as in Shannon's coding theorems and their quantum equivalents |16j . which 
rely on the encoded qubits' occupation of "typical" subspaces in the many- 
qubit Hilbert space. 

One beneficial feature of the original topological codes is that error cor- 
rection operations are local on the lattice; however, this is also a limitation. 
The convoluted topology of the above construction, which effectively destroys 
the codes' locality, is a way of overcoming this limitation (and sacrificing the 
associated benefit). 
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Figure 2.9: The kink K appears at the base of a handle. A diamond (dashed 
line) centered at P and containing K is shown. 
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Chapter 3 

Unconcatenated Quantum 
Computing 

3.1 Encoded Computation: The Toffoli Gate 

So far we have mainly addressed error correcting codes in terms of storing 
quantum information. To realize an actual quantum computer, it is necessary 
to show how quantum gates may be performed on encoded qubits. While spe- 
cific methods have been invented for a large class of concatenated quantum 
codes, the lattice codes we have discussed were designed specifically to ob- 
viate something inherent in the concatenation process: the genèric reliance 
on highly non-local gates. We must therefore attempt to devise methods 
for performing encoded gates that do not rely on the recursive structure of 
concatenated codes. 

Most known quantum error correcting codes (concatenated and lattice 
codes alike) can be defined by a set of operators, the stabilizer, each of which 
fixes every codeword. For a number of stabilizer codes capable of simple 
operations, like a bit flip X a or phase flip Z a on logical qubits a,b, . . ., it is 
also known how to perform any "normalizer" operation — i.e. one that can be 
composed from the C-NOT, which we will now denote by X a b, the ir/2 phase 
shift, and the Hadamard rotation R a . Normalizer operations alone, however, 
are insuflicient for universal quantum computation; a quantum computer 
with only normalizer operations can be simulated in polynomial time by 
a classical machine ^7j. A genuine quantum computer is realized either 
by the addition of a non-trivial one or two-qubit gate, like a single qubit 
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rotation by an irrational múltiple of 7r, or of a three-qubit gate like the 
Toffoli (controlled-controlled-NOT), which flips the third qubit whenever the 
first two are each |1). Because the necessary one or two-qubit gates require 
rotations (e.g., e l9X ) involving SU(2) angles that must be precisely tuned, 
they cannot be implemented fault-tolerantly. Small errors in these angles are 
inevitable and will gradually accrue until the computation veers completely 
off course. The Toffoli, on the other hand, like the C-NOT, does not involve 
any such rotations and therefore is suitable for fault-tolerant implementat ion. 

Peter Shor has given a procedure for performing a Toffoli given the 
ancilla state 

|^ 3 ) = |000) + |001) + |010) + |100), (3.1) 

which means essentially that possessing the tripartite entanglement of this 
state is equally as powerful as being able to perform a Toffoli gate. Let us 
review Shor's procedure (actually, a similar procedure that is equivalent to 
Shor's). 

As motivation, first consider the following construction, which uses one 
ancilla bit c. Letting c start as |0), suppose we could perform a majority vote 
on A Bc so that, forexample, |010) -> |000) and |110) -> |111). Equivalently, 
one might majority vote, but only carry out the effect on c, leaving A and B 
unchanged, so |010) -> |010) and |110) -> |111). Now just C-NOT c into C 
and disentangle c from ABC. The result is exactly a Toffoli on ABC. 

To majority vote on AB c, one measures Z^Zg and Z%Z C . If both mea- 
surement results are +1, ABc are already unanimous. Otherwise, the mea- 
surement results will reveal which bit is the odd-one-out. Unfortunately, 
these measurements have also revealed information about the initial state 
AB, in general collapsing it, inconsistent with the desired Toffoli gate, a 
linear operation. The solution is to perform a majority vote not directly on 
ABc but on three ancilla qubits, which are first entangled with AB. Here 
is where 1^3) enters. 

Given some arbitrary state of AB C, prepare ah c'm and perform the 
following operations: (I) C-NOT A into a and B into b, and (II) majority 
vote on abc (by measuring Z a Z b and Z b Z c and flipping the odd-bit-out if 
necessary). Suppose, for example, the measurement results from (II) are 
Z a Z\y = — 1 and Zj,Z c = +1. All but 8 terms will be collapsed away of the 
total 2 3 x 4 = 32 terms in the initial 6-qubit state. These 8 terms, as they 
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undergo (I) and (II), are (suppressing bra-ket notation): 

I II 

000)100 -> OOColOO -> OOCoOOO 

OlCiOOl -> OldOll -> OlCilll 
10C 2 000 -> 10C 2 100 -> 10C 2 000 

HC3010 -». 11C3100 -> 110000 

where Cj = 0,1. Note that all of the 8 possible bit vàlues for ABC are 
equally represented, so that all information in the initial superposition of 
ABC is preserved (albeit decoherently). Now C-NOT c into C. From the 
above table, this will flip C iff AB are 01 — not iff AB are 11, as desired for 
the Toffoli. Applying Xbc then gives the desired result. Finally, ABC must 
be disentangled from the ancillas a b c to restore the coherence of the original 
state. This is accomplished by applying X a b and X ac and then measuring 
X a . If the result is +1, ABC are disentangled. If —1, a phase error on 
the AB = 01 term has been introduced; it may be corrected by applying 
XaZabXa, where Zab = RbXabRb is the controlled-phase (C-PHASE) 
gate. 

Had the measurement results for Z a Z b and Z h Z c been other than —1 and 
+ 1 respectively, as in the above example, it is straightforward to determine 
what gates must be applied in place of Xbc an d XaZabXa- 

The Toffoli now just requires preparation of the three-qubit state 
First observe that if one can prepare 

|^2> = |00) + |01) + |10), 

|^3) may be obtained by preparing four qubits abcd in the state l^)!^), 
measuring ZbZ c , and performing a few simple normalizer operations. In 
particular, the measurement result —1 gives the state 

|0010) + |0100) + |0101) + |1010), 

which can be turned into l^)]!) by applying the C-NOTs: X ac , X db) X ad , 
Xm, and X C( i in that order. 
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3.2 Preparing the Two-Qubit Ancilla 

Let us define p(ai, a 2 , «3) as the (unnormalized) mixed state 

1 «i 

in the basis {^2), where \as\ < 1. It turns out, in the continuum of 

states p(ttj), there is nothing special about l^), obtained as ai — > 0. Being 
able to prepare any one state p(a;;) with |a 3 | < 1 is suficient to prepare \i/j 2 ), 
hence to prepare and construct a Toffoli gate. 

The state j^) is prepared by combining two copies of p(a,) through 
measurement to obtain a new mixed state which is closer to j^) than before, 
and combining two of these to get one still closer, etc, progressively purifying 
\ip 2 ) from the initial states. To start, prepare qubits abcd in the state Po®Po> 
where po = PÍ&í), and measure Z a Z c and Z^Zd- Suppose the results are +1 
and +1. Now perform X ac and X« to disentangle cd. For pure states, 
this whole process would give ^2) ^2) - ► |'02)|OO) and 1 11) 1 1 1) — > |11)|00), 
while either of the initial states |^2)|H) or |11)|^2) are inconsistent with 
the assumed measurement results. In terms of mixed states, this means 
Po ® Po — * Pi ® |00)(00| where p x is 

~ 1 a\~ 
_a\ al _ 

which is exactly p(af). Prepare another pi from two new p states, and 
combine the two pi states by again measuring Z a Z c and Z^Zd- Supposing 
the results are again +1 and +1, cd are disentangled, leaving ab in the 
state P2 = p(otf). Continuing this process through N levels gives pn = 
p(af ). The whole procedure may be pictured as a tree of pi states, joining 
in pairs from level L = to L = N (see Fig. 13. The recursiveness is 
reminiscent of concatenated codes, but here the complexity appears in the 
auxiliary purification process, not in the code itself. 
The fidelity in preparing \íp 2 ) is 

1 _ = tr(pjy |^2)(^2|) 3 

C " tr(p N )(^ 2 ) 3 + af l ' J 

very close to 1 if |ct 3 | < 1. The number of (logical) qubits used to achieve 
this fidelity is ~ 2^, which by (J3.2j) is ~ loge/ log | ck 3 | . This is the same 
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kind of polylog scaling desired from the code itself (referring to the scaling of 
block size with desired failure rate e). Finding the number of operations on 
encoded qubits necessary to prepare \ip2) is n ot as easy, since the assumption 
that all Z a Zb, Z c Zd measurement outcomes are +1,+1 requires repetition of 
the procedure a number of times before one expects such to occur. 

To prepare a single p^ state prepare two pi-i states and then combine 
them by measurements. If the measurement results are not +1,+1, just 
discard these states and keep trying. (This is not an optimal procedure, but 
it will suffice.) Therefore, if the chances of any one attempt succeeding are 
P(L), the expected number of logical operations G(L) necessary to prepare 
Pl is ~ 2G(L — 1)/P(L). This assumes high confidence in the one pair of 
measurement results +1,+1, which should be the case since abcd are logical 
qubits. But even if there is a significant probability e m ^> e for any one 
measurement result to be in error, the purification procedure can be made 
robust. Once a +1,+1 result is obtained, just repeat the measurements a 
number of times and accept the state only if, say, a majority of the results 
are +1,+1. To get 1 — e confidence in the measurement outcome, one must 
repeat ~ loge/loge m times. This implies 

G{L) « G{L - 1) + ^ . (3.3) 
P(L) loge m 

It is not hard to see that P(L) must increase with L, since this recursion 
relation implies that either \ip2) will quickly begin to dominate successive pi 
states, in which case P{L) — > 1/3, or |11) will dominate and P{L) — > 1. Both 
of these vàlues are larger than -P(l), which can be calculated as a function 
of | QT3 1 < 1 but is always bounded from below by 1/4. Iterating ()3.3|) with 
this bound gives 

G(N)~8 N ^1_ ^g£T (34) 

loge m (log|a 3 |) 3 loge m 

Note that G(N) is the total number of logical operations, but these can be 
done in parallel so that the actual purification time is ~ iVloge/loge m ~ 
log(| loge|) loge/ loge m . The point is that even with the demand of a definite 
sequence of measurement results, time requirements still scale polylogarith- 
mically with e. The crucial fact leading to this scaling is that the probability 
for getting the measurement results +1,+1 in combining two pi states is fi- 
nite as L — > 00. Thus one can prepare |^ 2 ), | ^3) , and execute a Toffoli gate 
if one can prepare one of the mixed states p(ai) with |a 3 | < 1. 
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There are múltiple ways of obtaining a state p(|cü3| < 1) for codes which 
are not too large. In fact, Shor's own procedure for preparing can do so. 
An alternative method will be presented here, applicable to codes possessing 
a non-trivial normalizer operation (here C-NOT) that is transversal, so the 
encoded operation factors into a number of independent operations on phys- 
ical qubits. The method works by performing a very noisy measurement of 
the C-NOT operator. 

3.3 Noisy Measurement of C-NOT 

Were it possible to measure C-NOT with high fidelity, one could easily pre- 
pare \ip2) = PÍ&í = 0). It turns out imperfect measurement of C-NOT is still 
capable of yielding p(|«3| < 1). 

For reference, the eigenstates of the C-NOT operator X ab are 1 00) , |01), 
and 1 10} + 111) with eigenvalue +1, and 1 10} — |11) with eigenvalue —1. Let us 
first describe a fault-mtolerant measurement procedure, that is, one which 
permits a single error to spread rampantly throughout a block. Prepare one 
physical ancilla bit c as |0) and apply a certain three-bit gate U aibiC0 bitwise 
over physical bits ei; and 6; in the blocks encoding a and b (but always using 
the bit Co) . U is shown in Fig. 13.21 The first Hadamard rotation causes the 
Toffoli to flip co just if afii start in the —1 eigenstate 1 1 0} — |11) of X^, and 
the second Hadamard undoes the effect on 6j. 

Apply U bitwise over ab and measure Z CQ . The result Z Co = ±1 is equiv- 
alent to the result that X a b = [j^^j, = ±1, so one has effectively measured 
X ab . To understand this process in detail, expand the initial state of ab in 
eigenstates of the operators X aibi for i — 1, . . . , n: 

E^i x > = ( E + E W> 

x y«>(x)=0 w(x)=l J 

where |x) = \xi---x n ) and each is one of the four eigenstates (xj = 
1,2,3,4) of X a . bi . The right hand sum is rearranged to segregate strings of 
even and odd weight. The weight function w(x) equals the number (mod 2) 
of "4"s occurring in the string x, X{ = 4 corresponding to the —1 eigenstate 
1 10) — 1 1 1 } of X aibi . Using the transversality of C-NOT and the definition 
of u>(x), one finds X ab \x) = (— l^^lx). Thus the sum over strings with 
u>(x) = is the projection onto the +1 eigenspace of X ab , and the sum with 
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w(x) = 1 is the projection onto the —1 eigenspace. It follows that the action 

^|x)aò|0) co = |x)aòHx)) C0 , 

which means measuring Z Co is equivalent to measuring X a b. 

This method of measurement is highly sensitive to errors; just one physical 
bit error can change w(x) for an entire string of bits, making the measurement 
result erroneous. As the block size n gets large, the chances of an even 
number of such errors occurring becomes nearly equal to the chances of an 
odd number occurring. Thus the measurement result telis very little about 
whether a +1 eigenstate or a — 1 eigenstate of X ab has been obtained. This 
little bit of information, however, turns out to be important for preparing 

l^2>. 

As mentioned the above procedure is fault-intolerant, since one physical 
bit phase error may infect Cq and thus spread rampantly throughout the 
block. It can be made fault-tolerant by using an ancilla c, which is not just 
one bit, but a superposition of n physical bits over all even weight strings 
("weight" is now in the sense of counting "l"s). Such a superposition is 
prepared as 

|even) c = (jl R c)j (|0 • • • 0) c + |1 • • • l) c ). 

The gate U ai b lCl will be applied bitwise across abc so that a single error in 
one block can at most spread to one bit in each of the other two blocks. 
Acting bitwise on \xi)\ ) Cl through \x n )\ ) Cn , U will flip a number of bits in 
the initial c state equal (mod 2) to exactly w(x). Thus 

w-i«»>.={j£j->; (3. 5 ) 

Measuring Z Ci bitwise over c with the result \\ i Z Ci = ±1 is now equivalent 
to measuring X a b with the result X a b — ±1. Note that a single phase error 
in the n-bit cat state, or equivalently a bit flip in the sum over even weight 
strings, will change this sum into one over odd weight strings, again alter- 
ing the measurement result while still projecting the state onto one of the 
eigenspaces of X a b. So the measurement procedure is now fault-tolerant, but 
the measurement result is still highly sensitive to single bit errors, giving 
little information about which eigenspace the state | ) a b col·lapses into. 

One can also perform a noisy measurement of the C-PHASE operator 
Z ab . The action of Z ab is just to apply a minus sign if a b are in |11), which is 
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unitarily equivalent to X a b through the basis change Rb- To measure Z a b first 
apply Rb, then measure X a b by the above method, and reapply Rb. These 
procedures may be adapted, by changing the bitwise operation U, to noisy 
measurement of such operators as X ab X cd , Z ab Z cd , and Z ah Zb c . 

3.4 Preparing the Two-Qubit Mixed State 

First prepare two logical qubits ab as (|0) + |1)) 2 and measure Z ab by the 
method given above, making use of an ancilla block c. If the measurement 
result were +1 and all qubits were error-free, one would have prepared exactly 
|-0 2 ) - But this will be changed by errors (i.e. decoherence, gate errors, or 
measurement errors) occurring either to the bits encoding a and b or to those 
of the cat-like ancilla c used in the noisy measurement procedure. In fact c is 
especially vulnerable because it is not protected by any code at all — a single 
bit error anywhere in c can reverse the observed measurement result for Z a b. 

Depending on whether errors are unitary or decoherent, this yields a 
cohererent or incoherent superposition of \ip2) and |11), which will be shown 
to be of the form p(|a3| < 1) in either the unitary or decoherent case, hence 
a candidate for purification. 

Phase errors to the bits of a b cannot be transmitted to c by the above 
procedure, so are irrelevant. Bit flip errors to ab can be transmitted but 
are equivalent to bit flip errors occurring to the bits of c so all errors can be 
effectively regarded as occurring to c alone. Let us first consider the case of 
(uncorrelated) errors purely decoherent in the Pauli basis a m , so that each 
qubit Cj suffers no error, a phase error, a bit error, or both errors — each with 
some fixed classical probability. 

Phase errors in c can affect only the relative sign of terms in |even) c and 
|odd) c of ()3.5|) . hence are extinguished once the Z Ci measurements are made. 
Depending on whether c is attacked by an even or odd number of bit errors, 
the measured eigenvalue of Z a b will be inferred either rightly or wrongly from 
the outcome of the Z c . measurements. So given the result \\ i Z Ci = +1, an 
even number of bits errors will yield |-0 2 ) as desired; however, an odd number 
will yield |11) unbeknownst to us. 

If each q suffers a bit error with probability pi, the difference between 
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the chances of an even number of bit errors and of an odd number is 

' n \ 

n e (- -Pi) 1 -* = ri( i - 2po- (3.6) 



=1 Kí=0,1/ 



Given that these two probabilities sum to 1, this implies the preparation 
procedure will yield not exactly |^ 2 ), but the state p(0, 0, a 3 ) with 

where the last expression holds for large n. It thus appears that one cannot 
teli whether or not |ct 3 | < 1 for a given ancilla block c if even a few of its 
bits might have Pi > 1/2. This is true even though current codes themselves 
are completely robust to these "defective" bits so long as their distribution 
is suitably uncorrelated and infreqüent at the level of the code's threshold 
error rate. Still what has to be considered for the purification process is not 
a single ancilla block c, giving rise to one iV^Tike state, but a sequence of 
such blocks, each with its own set of defective bits and conseqüent value of 
a 3 — c4 m \ where m runs from 1 to 2 N , the number of |^2)-like states input 
to the purification process. 

The fidelity in purifying j^) is that given by (J3.2|) with ot\ replaced by 



n 



c4 m) « e" 2Ar+1 ( n » {1 ~ 2pi) ). 



where (• ■ ■) is an average over the ensemble of c blocks. Assuming errors 
are uncorrelated between different c blocks, the average factorizes and the 
purification fidelity is 

i_i e -2 jv + i n t (i-2<Pi» - 

Here only the ensemble averaged bit flip error rates appear. Assuming the 
locations of defective qubits are uncorrelated between different c blocks, 
their pi > 1/2 contributions are simply weighted out in the average. This 
means infreqüent defective bits no longer pose a problem, owing to the 
distributed nature of the purification process. Defining an average error 
rate p = (l/ n )Ei(Pi)j ^ ne above product is approximately e~ 2pn . In order 
that the resulting fidelity be comparable to that of the code itself, which is 
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~ 1 — exp(— Kn 13 ) for some power (3 and constant K, the number of physical 
qubits used in purification is roughly 

n2 N ~ n 1+t3 e 2pn , 

which puts a limit on the block size n of the code being used, since the 
number of qubits used in purification should not grow exponentially with 
block size. Thus n cannot be larger than 

1 1 

n ~ - log -. (3.7) 
p p 

The opposite case of purely unitary errors is the same in its result. Here, 
errors comprise a set of unitary operators which act on the Cj respectively. 
Each such operator can be written in the Pauli basis and put in the form 

Ej = (Ajl + iB j( j z ) + ia x (Cjl + iD j( j z ), 

where Aj, Bj, Cj, Dj are real. Assuming low error rates, so Yli |A| EL 1-^*1 
and likewise with d and D i in place of Bi, one can show that these errors 
take c from its prepared state |even) to 

J^J Eleven) = |even) + itan(S c )|odd), 

i 

where Se ~ ^ tan _1 (Cj/yii). This leads to preparation of the state \ip2) + 
itan(£c)|ll) in place of \4>2)- But this state is precisely p(cüí) with oi\.2 = 
itan(E c ) and a 3 = — tan 2 (Sc), which can be purified to l^) if tan 2 (Sc) < 1. 
For large n, Ec will be very sensitive to the error amplitudes Cj, and in 
practice one would have no way of knowing whether taxíÇEc) < 1 or not. 
This is the same problem noted above in the case of pure decoherence, and 
it too disappears when one realizes that the purification input is not a single 
state l^) + itan(Sc)|ll) but an ensemble of such states each generated by 
a different set of blocks abc. The purification fidelity is now given by (|3.2|) 
with aj} replaced by 

J]tan 2 (S c )^e 2ÍV+1 ^ tanS ^. 

m=l 
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where (• ■ ■) again averages over the ensemble of c blocks. Using this, the 
definition of Ec*, and expanding the logarithm gives (log(tan £<?)) as 



The factorization follows assuming independence of errors between different 
qubits in each c block. Now expand the exponential in a power series. If 
the error distributions are all such that the two bit flip amplitudes Gj and 
—Ci are equally likely to oceur, the expectation vàlues of the odd terms in 
the power series vanish. (If the distributions are otherwise, one might use 
the computational basis {|0), — 11)} instead of {|0), |1)} for the qubits in half 
the c blocks, so that the same physical error would correspond to the bit flip 
amplitude — Cj as often as it would to Cj.) The even terms may then be 
resummed and the expectation value in (|3.8|) becomes 



where (pi) = (Cf/A?) to lowest order in Cí/Aí, hence (pi) can be taken 
roughly as a bit flip error rate. For small k the cosine functions will be close 
to 1, and the product over i will be greatest. As k gets large, the cosines will 
sample their full range and the product will be highly suppressed. Therefore 
the cosine above can be replaced by exp(— 2{2k + l) 2 (pi)), as if k were always 
small, giving the main contribution to the sum: 



where p is again the average of (pi) over i = 1, . . . , n; this is basically the 
same result as obtained for purely decoherent errors. Thus again ()3.7|) gives 
the largest allowed block size in the regime where the resources needed for 
purification scale polynomially with block size. 

3.5 Progressive Concatenat ion 

In case higher fidelity is desired of the code than ()3.7j) allows, the above 
methods by themselves are insumeient and one must resort to concatenation. 




(3.8) 
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However, in conjunction with these methods, an unconventional, exponen- 
tially weaker form of concatenation can be used. A usual concatenated code 
is self- similar, the same abstract code (e.g. the 7-qubit code) being used 
at each level in its recursion. Here one is free to increase the block size at 
each level, as long as (|3.7J1 is satisfied level-by- level. In these "progressive" 
concatenated codes, many fewer levels are necessary given a desired fidelity 
1 - e. 

In particular, for one error correction algorithm [22] in the context of 
lattice codes, some reasonable parameters are p = p c /10 < 1CT 3 , so that n = 
1000 is acceptable by ()3.7|) . Here e ~ {p/Pc) nl3 where (3 = log 9 2 rs .315, which 
gives e ~ 10~ 9 . So, if the desired fideltiy is below 1 — 10~ 9 , no concatenation 
is necessary. Otherwise, one can begin concatenating. 

Consider a single concatenation of a chosen code. Physical qubits with 
error rate €o = p are arranged in code blocks of size and these blocks 
are themselves arranged in blocks of size n 2 . The effective error rate at this 
higher level is just the failure rate of blocks at the lower level: 

ex ~ (e /p c ) Kn ", (3.9) 

where p c , K, and j3 come from details of the code being concatenated. The 
code whole has failure rate 

e 2 ~ {e\/p c ) K <. (3.10) 

where includes the effect of storage errors described by e\ and also gate 
errors associated with the operations necessary to perform error correction. 
Thus e* will have the same form as e% in (|3.9|) but with eo = p replaced by a 
physical qubit error rate including the effect of these additional errors. In 
other words one must deal not only with a storage error threshold but also 
with a gate error threshold associated with the computations necessary for 
error correction. Of course, if one intends to use the logical qubits stored by 
the code for actual computations, this would be necessary anyway. 

Assuming the effective error rate is still below threshold, a single con- 
catenation of the code in the above example gives 

„ = (wy , V'*~io-° 

where the vàlues K — 1, /3 = log 9 2, and = p c /5 have been used. The 
block sizes n\ = 1000 and n 2 = 2 ■ 10 7 were chosen to be consistent with 
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ni ~ (l/e2_i) log(l/e2_i), i-e. the condition (jH.7j) applied at each level. 
Thus, as long as one does not require a fidelity better than 1 — 10~ 830 , a 
single concatenation is sufíicient given the above parameters. 

Because the block sizes Ul may increase so rapidly, the number N of levels 
necessary for a desired fidelity 1 — e scales differently than in usual concate- 
nation, in which iV ~ log(log(l/e)) = log^(l/e). For these progressive 
concatenated codes, N is determined self-consistently by iV ~ log^ ( 1 /e) . 
One might wonder about the asymptotic behavior of thresholds and fideli- 
ties as iV — > oo, taking into account the reciprocal effects between progres- 
sive concatenation and purification; however, this seems irrelevant given the 
smallness of e already at iV = 2. 
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Figure 3.1: Combining p states to prepare pn (above, N = 3). 
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Figure 3.2: The operation U on physical qubits òj cq. 
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Chapter 4 

Path Integrals and Beable 
Trajectòries 

The strategy employed in the last chapter to achieve a desired quantum state 
(in our case a certain two or three-qubit entangled state) by an iterative 
process of purification appears somewhat narrow in its application in the 
context of certain quantum error correcting codes. However it turns out this 
underlying idea can be translated into classical terms and proves valuable in 
defming a purely classical algorithm for the simulation of quantum systems — 
meaning, that the goal of the algorithm will be to simulate quantum systems 
but its implementat ion will be entirely concerned with classical computations 
on a normal computer. 

As we have discussed, numerical simulation of quantum systems by classi- 
cal algorithms is limited by the fact that the dimension of a system's Hilbert 
space grows exponentially with the number of physical degrees of freedom. 
One general approach to this problem is to obtain a correspondence between 
the desired quantum system and a statistical ensemble of classical systems 
that are easier to simulate. Green function Monte Cario [21] and coherent 
state representations in quantum òptics |2Sj both derive Fokker-Planck equa- 
tions, which can be realized through stochastic trajectòries in an associated 
classical configuration space. Another category of methods well known in 
condensed matter and particle theory is that of path integrals with pseudo- 
dynamical importance sampling j2H]. It turns out these configuration space 
methods are closely related to well known beable 1 models of quantum me- 
chanics. 

1 John Bell used the term "beables" rather than the misnomer "hidden variables" to 
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Understanding these relationships offers another perspective on existing 
computational methods and also points the way to new such methods. In 
particular Monte Cario, path integral, and coherent state representation ap- 
proaches have mainly been applied by exploiting their connection to diffusion 
processes, with the result that calculations have been limited to ground state 
or thermal properties, or else very restricted classes of Hamiltonians. In 
the case of coherent state methods, like the P or positive P representation, 
Hamiltonians with higher than quadratic interaction terms give rise to mas- 
ter equations that cannot be cast as stochastic differential equations over 
an associated phase space. In the case of Monte Cario and path integral 
methods, a connection to diffusion processes can only be made by going to 
imaginary time. 

However, the beable methods to be presented here are naturally formu- 
lated in real time and are not restricted to narrow classes of Hamiltonians. 
Whatever their computational efhcacy, they are thus at least well-defmed, 
general methods for studying dynamics. 



4.1 Langevin Method and Nelson's Mechan- 
ics 

The thermal expectation value of an operator O is often given in the form of 
an imaginary time path integral: 

(O) = J VxO{x)e~ s ^ x) (4.1) 

where S^(x) is so me Euclidean action over the degrees of freedom x = 
(x 1 , . . . , x d ). To compute this path integral we need a method to select paths 
over which the action may be sampled according to the weig ht e - SE(x) . The 
methods to be considered here generate paths as the solutions of initial value 
problems in the classical configuration space {x} of the system. 

In the Langevin or "stochastic quantization" approach j2E]EZl; (O) is 
computed as a long-time average over a path x(r) generated by the stochastic 
differential equation 

dx i = -diS E dT + V2dW i (4.2) 

distinguish them from observables in quantum theory. 
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where h — 1, di = d/dx l , and dW 1 is a Wiener process, i.e. an independent 
Gaussian random variable at each r with mean (dW % ) = and variance 
((dW 1 ) 2 ) = dr. Here, r is not a time (real or imaginary) but an auxil- 
iary variable introduced to parameterize paths. And one can show that an 
ensemble of trajectòries evolving by (|4.2|) samples e -515 ^ in the limit r — > oo 
Nevertheless, observe the similarity between the Langevin equation (|4.2j) 
and the stochastic process involved in the generalized Nelson hidden variable 
theory [2H] for partides of mass m = 1 under a potential V(x): 

dx l = (dtS + a^pj dr + y/àdW* (4.3) 

where dW 1 is also a Wiener process, and a > is a free parameter. Note 
that Nelson's theory [22] results if a — 1 and Bohm's deterministic theory 
|3*Üj if a = 0. While ()4.3|) is normally defined in real time, we need the 
imaginary time analog for comparison with the path integral (I4.1J1 . This is 
easily obtained by taking ip(x, r) = Re s as a solution of the imaginary time 
(r = —it) Schrodinger equation 

which will hold if we evolve R and S by 

^ + ^^(i? 2 ^)=0 (4.5) 

i 



where 

i 

is Bohm's quantum potential. 

The crux of this kind of hidden variable theory is that an ensemble of 
partides evolving by ()4.3|) will be distributed as R(x,t) 2 at time r if they 
are initially distributed as R(x, 0) 2 at time 0. This follows from the fact that 
the Fokker-Planck equation 



R 2 



(4.8) 
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for the distribution P(x, r) generated by (|4.Hjl becomes identical to the con- 
servation equation (|4.5jl if we put P = R 2 . Note that (|4.5jl and (|4.8jl are 
unchanged from the case of real time. 

The effect of going to imaginary time is just to invert V(x) and V q (x) 
in (|4.fí|) . which resembles the classical Hamilton- Jacobi equation for the Eu- 
clidean action Se = / Hdr, where H is the classical Hamiltonian. In fact, 
(14. fíj) implies that S — J (H + V^)dr, where the integral is taken along a tra- 
jectory defined by the "current velocity" v % = dS/dx l , which means a Bohm 
trajectory. 

Therefore, choosing a = 2 in (|4.3|) gives the time-reverse of the Langevin 
equation (|4.2|) except for extra V'-dependent terms in the drift. The result 
is that, while the Langevin trajectòries sample c~ Se ^ in the limit r — > oo, 
the Nelson trajectòries generate a particle distribution that samples R 2 itself 
for all r > 0. This suggests that we look for a Langevin-like method of 
importance sampling in which the drift is somehow guided by information 
we might have about if>(x, r). 



4.2 Guided Random Walks 

Indeed such a method already exists in the literature. It relies on a trial 
wavefunction ip to generate trajectòries useful for the evaluation of observ- 
ables, and these trajectòries become exactly Nelson trajectòries as ip becomes 
more accurate. 

The ground state expectation value of some observable O is given by 

(0) = lim <^;;XIT 0> <«> 

- - (V>o|e lTtl \ÍJo) 



where |t/>o) is our trial evaluated at r = 0. The exponential factors in the 
numerator serve to dampen out any excited state components in ipo as r — > 
oo. Identifying 

(ip \e~ TH \x) = Ï>{x,t) 

as a corresponding trial wavefunction for the imaginary time-reverse Schrodinger 
equation, (|4.4jl with r — > — r, and breaking the other e~ TÍÏ factor into iV 
pieces e _eiï • • -e~ eH , the numerator in ()4.9|) can be written as a path inte- 
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gral: 



. N-l 

/ díxo--- dx N $(x , 0)i/)(x o , 0)O(x N ) U e (x p+1 , x p ) (4.10) 

p=0 



where x r , 



[x 



o 



UJx 



— \Xr, 



-eH\ 



Xp) 



lf){x p+1 ,T p+1 ) 
ïj(Xp,T p ) 



(4.11) 



, x^) denotes a single classical configuration of the system 



u p v"pi • • • i ~pj 

at time r p = pe, and we have assumed that O is a local observable with 
matrix elements (x\0\x') = 8{x — x')0(x). The ratio of ^ terms in (j4.11|) 
produces a sequence of cancellations that allows us to have r/i evaluated at 
r = in (|4.1()j) . To lowest order in e, it can be shown [HI] that 
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where the drift and residual action are given by 



Dl 
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(4.12) 



(4.13) 



A5„ 



In the method of guided random walks the path integral (|4.1üj) is 
sampled by stochastic trajectòries over configuration space {x}. In particu- 
lar, the Gaussian factor in (|4.12|) is interpreted as the probability to go from 
Xp to x p+ i over the p-th time step, and e~^p ASp gives the "score" of the path 
(x , . . . , xjy) in the stochastic average. 

Now, if ip(x, t) were chosen to exactly satisfy the time-reverse Schrodinger 
equation, we see that AS P would vanish and the trajectòries would optimally 
sample (|4.10p . But then, as e — > 0, the jump probability in ()4.12|) and the 
drift (J4.13)) exactly match those of Nelson's theory, (|4.3|) with a = 1. In 
other words, the guided random walk method of importance sampling can 
be seen as an attempt to generate Nelson trajectòries (for the time-reverse 
problem) with a trial wavefunction defining the drift. 

The fact that the generalized Nelson theory (|4.3J) matches the quantum 
evolution for any a > suggests a new one-parameter family of guided 
random walk algorithms for path integral evaluation, with a controlling the 
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strengths of both the diffusion and the drift. This family is similar to the fam- 
ily of hybrid Langevin molecular dynamics algorithms introduced by Kogut 
and Duane [221; with the pure molecular dynamics algorithm in the Kogut 
family corresponding to Bohm trajectòries (a = 0) in our family. While both 
of these last algorithms are deterministic, there does not appear to be any 
quantitative relationship between them. 

4.3 Iterated Bohm Trajectòries 

The effectiveness of importance sampling in a guided random walk algorithm 
depends on the accuracy of the trial wavefunction. It would therefore be de- 
sirable if one could use information gained from the trajectòries themselves 
to improve the trial and generate new trajectòries. Iterating the process may 
then produce a sequence (/0 , ^ > ■ ■ ■) °f successively better trial wavefunc- 
tions. 

The following method in the case of pure Bohm trajectòries is due to 
Goldstein [HS]. Since Bohm's theory is equally applicable to real time t, let 
us return to that case. Bohm's equations of motion, obtained by using ()4.3|) 
with a = and differentiating the real time version of (|4.6|) are 



with the minus sign replaced by a plus sign in the imaginary time case of 
(|4.4j) . Here the Bohm particle velocity v % is identified with dSjdx 1 as usual 
in Hamilton- Jacobi theory, except that this S, from (|4.6|) . includes the effect 
of V^j in addition to the classical potential V. Also, here, 



defines the the convective or "along-the-trajectory" derivative. 

Writing ip(°^ = e l5<0) , we evolve trajectòries according to (j4.14j) with 
R(°ï used to the calculate in ()4.7|) . Were tp^ an exact solution, ()4.5|) and 
(|4.6j) would ensure that these trajectòries generate a particle distribution 
equal to (R^) 2 . However, with an inexact trial wavefunction, the actual 
distribution will differ from (R^) 2 . We can thus use this actual distribution 
to define the next iterate R^\ which can then be used to obtain a new V q , 




_d_ 



[V(x) + V q (x,t)} 



(4.14) 
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new trajectòries from f)4. 14j) . and a new phase if desired. Iterating this 
procedure yields a sequence (ip^ ',^ 1 ', . . .) that has the exact solution ip as 
a fixed point; however, convergence is not guaranteed for any fixed number 
of trajectòries being propagated. 

There is also another technical issue. The equations of motion (j4.14J) will 
generate unique non-crossing trajectòries when is obtained exactly from 
ip(x,t), but no such guarantee exists for an inexact iterate ip( l \ Viewing 
ÍI4.14|) . with used to calculate V q , as defining a map from configuration 
space at time to itself at time t, one thus finds that the map will generically 
be many-to-one. In effect, for t large enough, ip^ (x, t) would be a multivalued 
function of x. Either dealing with this multivalued-ness or eliminating it by 
fiat poses a significant problem in defining the algorithm. 

Goldstein's iterative algorithm can also be compared to moving grid meth- 
ods for quantum dynamics that rely on Bohm trajectòries to define the grid 
|34] |55j . These methods propagate a swarm of Bohm trajectòries similar 
to the way a fluid dynamics algorithm propagates a large number of fluid 
elements, and therefore they do not require any trial wavefunction to get 
started. They are PDE propagator algorithms that are truly local in time 
and (configuration) space. 



4.4 Discrete Beables 

Let us now consider methods posed explicitly for a finite dimensional Hilbert 
space spanned by the basis states \n) with n = 1, . . . , N and with Schrodinger 
equation 

j t \m) = -mm)- (4.15) 

Originally for the purpose of attacking conceptual problems in quantum 
field theory, John Bell found an analog of Bohm's model in which trajectò- 
ries are generated by "beables" (i.e. classical-like partides in state space) 
stochastically jumping between states connected by non-zero Hamiltonian 
matrix elements |3ü]. In place of ([4.3)1 . Bell takes the probability for a tra- 
jectory to jump from state m to a distinct state n, sometime in the interval 
(í, t + e), as 

T r*w_ / 2Re{z nm (í)}e if Re{z nm (t)} > . . 

lnm[t)e \0 ifRe{z nm (í)}<0 ^ iD J 
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where we define 

z nm (t) = -iH nm ^^ . (4.17) 

and ip n = (n\ip), etc. To ensure normalization, the probability for a trajectory 
to stay at m is thus given by 1 — Tnm{t)t, where the primed sum excludes 
the diagonal term n = m. Note the similarity between (j4.17|) and (j4.11j) . 
But here jumping from m to n is allowed only if there is a non-zero matrix 
element H nm for the transition. 
From f)4.17|) . we find 

1-01 2 

Re{z nm } = -Re{z mn }-- — ^ , (4-18) 

which implies that either T nm = or T mn = 0. Together with (j4.15J) this 
gives 

^|VV| 2 = ^ 2 ^{ Z nm\^m\ 2 } = ^2(T nm \^m\ 2 ~ T mn \í/j n \ 2 ) (4.19) 
m m 

in analogy with ()4.5|) . Note that the T nm term contributes when ~Re{z nm } > 
0, and the T mn term contributes when Re{z nm } < 0. 

Now consider the distribution P n {t) of beables in state space, jumping in 
accordance with (J4.1b|) . The analog of the Fokker-Planck equation (J4.8)) is 

-P n = J2( T nmPm ~ T mn P n ) . (4.20) 
m 

And this becomes identical to ()4.19|) if we put P n = |^ n | 2 . Thus, provided 
-fn(O) = iV'n (0) | 2 , we are guaranteed P n (t) = |-^ n (í)| 2 for all t > 0, just as in 
the continuous case. 



4.5 A New Guided Random Walk 

Can we find a stochastic method for path integral evaluation resembling Bell 
trajectòries? Consider the real-time analog of (J4.10)) for a finite dimensional 
Hilbert space, so that the continuous variables replaced by discrete 

ones n p . We can make a connection to Bell's theory by expanding e~ leH 
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to first order in e and evaluating matrix elements. Assuming a diagonal 
observable (n\0\m) = 6 nm O n , we get 

(0)t = <M*)*tM°) O nN Y[(-ieH np+inp ) (4.21) 

V(N) peJ 

where V(N) = (n , . . . ,n N ) specifies a path in which H np+inp 7^ for each 
n p+ i 7^ n p . The jump set is defined as J[V] = {p | 7^ n p }. Now we can 
use the same trick of introducing i[) ratios to write 



^n N (ty i[(-ÍH np+inp ) = V>no(0)* II Z n p+1 n p (tp) ]J ^ ^ (4.22) 
PGJ pGJ p£J ^P^P) 



with z nm as in ()4.17|) . 

To realize ()4.21|) as a weighted average over stochastic processes, we need 
to choose the jump rates T nm so that the probability 

Prob[P(iV)] = n t T n P+l n p H(l - eY! n Tnn p ) (4.23) 
peJ p^j 

of realizing a given path V(N) resembles the double product in (|4.22p as 
closely as possible. ls a sum excluding the diagonal term {n = n p here). 
Consider the choice would be (n p+ i 7^ n p ): 

T np+irip — \z np+inp \a p (4.24) 

where the z nm are given by ()4.17|) with our trial wavefunction ip, and the a p 
are to be determined. The remaining phase factors in ()4.22|) not assimilated 
into ()4.23|) will then constitute the complex "score" of the path V(N) in 
the stochastic average. That is, the stochastic average that computes í)4.21|) 
is obtained by propagating trajectòries according to (|4.24|) . calculating the 
score of each trajectory, and adding up all the scores. 

One can show that apart from endpoint contributions around p = 0, N, 
the double products are matched in absolute value with error 0(iVe 2 ) if the 
a p satisfy 

E(^|^-^i^i-| lo gi^^)i) =° ( 4 - 25 ) 

where j(p) is the largest member of J less than p, and N p is the number of 
time steps between the j(p)-th step and the next jump. Unfortunately we 
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cannot make the sum vanish term by term. When we need to choose a p at 
the p-th time step, we do not yet know the value of N p for the trajectory, 
because we do not yet know when the next jump will occur. It is therefore 
necessary to choose a p dynamically. At each time step p we can evaluate the 
sum ÍI4.25|) up to p and choose a p to stabilize it back toward 0. 

In particular, if we change a p only just after a jump, so that a p = cij(p)+i 
for all p, we can explicitly calculate the expectation value of the sum (|4.25|) 
with respect to variations in N p produced by the the jump probabilities 
(14.24}) . Setting this expectation value to zero, term by term, and solving 
for a p yields a critical value such that choosing a p above (below) the critical 
value will tend over time to increase (decrease) the sum. This enables us to 
stabilize the sum and achieve a correspondence between, on the one hand, a 
stochastic average over trajectòries defined by ()4.24|) and, on the other hand, 
the path sum (gjZTJ). 

We have thus obtained a stochastic algorithm for evaluation of the dis- 
crete path integral that resembles Bell's discrete hidden variable theory. As 
opposed to the Standard random walk algorithms, this one involves trajec- 
tòries that jump from one site to another only if there is a non-zero matrix 
element for the transition. In particular, with a spatial basis \n) and a local 
Hamiltonian, the path space is reduced considerably. The price is that the 
trajectory scores become complex in the case of imaginary time as well as in 
real time. 

If the matrix elements H nm are bounded and independent of t, a genèric 
estimate may be obtained for the importance of different classes of paths. 
Going back to (|4.21|) and performing the sum over paths comprising the 
same sequence of jumps, differing only by when each jump occurs, we have 

{0) t =Y,O nM r M JCmM (4-26) 

Aí=0 ^ ' 

M-\ 

P(M) P=0 

(m is a kind of partition function summing over all paths P(M) = (n , . . . , n M ) 
without stops: n p+ i ^ n p for all p = 0, . . . , M — 1. The binomial coefficient 
counts the number of paths V(N) from (j4.21j) that correspond to one path 
P(M) in KM . 



(4.27) 
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Using Stirling's formula, 1)4.26)1 may be approximated as 

(0) t = Y,On M e m) (4.28) 

M 

S(M) = Mlog í — J - — + log Cm • (4.29) 

Now, log \ Cm\ ~ Mloge can be interpolated into a smooth function of M for 
any finite e. However, from (J4.27)) we can write 



Cm+i = Yl II (- ieH n P+ m P 

P{M) P=0 



M-1 r , * 



nn M . 

i 

n M 



(4.30) 



which means that Çm+i is obtained from (m by multiplying each term in 
the path sum by a different factor with magnitude O(e) and unconstrained 
phase. This suggests that arg(C AÍ+1 ) will differ randomly from arg(^ M ), so 
that Im(logÇjví) — ar g(CA/) can n °t be interpolated into a smooth function of 
M. Thus we expect RelS"}, not ImlS 1 }, to control which terms are dominant 
in ijOty . 

To find these terms we need to evaluate 

- r Re{S} = log — - — + — log |Cm| • 



dM °\My dM 

Care must be taken in defining the derivative on the right, since log |(m| 
diverges as e — > 0. We use the finite difference 

log ICa-/| ~ log|Ciif+i| — log|Cjwrl 



dM 

together with ()4.3()j) and the definition (|4.17|l to get 

d 



dM 



log Km | ~ ^g(e\(J2' n z n n M ) M \) 



where (■ ■ -)m averages over all paths V(M) with complex weight as in (|4.27|) . 

Solving the saddle point equation dS/ dM = will determine which path 
lengths M are most important for computing ([4.26)1 . This equation has the 
form 

Ioga; + 2x + A = 
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where A —>■ oo as e — > 0, and one can verify that 

is an explícit solution. When A is large, x = e~ A is a very good approxima- 
tion, which gives the saddle point equation 

M = t\{T! n z nnM ) M \ , (4.31) 

whose solution we denote M = M*. Note that the right side of ()4.31|) itself 
depends on M. However, assuming H nm are bounded, ^2 n {z n n M )M wm be 
bounded, and the solution M* will not be very sensitive to its M-dependence. 
To determine the spread of dominant terms around M = M* , we evaluate 

by putting 

d 2 

log | Cm I ~ log | Caí+2 I - 2 log I Ca/+i I + log I Cm I 



dM 2 



log 



ICm+2/ÇaJ | 

ICm + i/Ca./I 2 



Separating out two extra matrix elements of H in Caí+2 as we had separated 
out one matrix element in (|4.30|) . we obtain 

log | Cikf | ~ log — — |2 JUI (4.33) 



dM 2 




which is independent of e. If this term dominates in ()4.32j) . it will control 
the spread of terms around M = M* that contribute to ()4.28jl . Otherwise, 
1/M, will set the scale of (IQ2l . and the spread will be AM ~ y/W*. 

Consider a simple limiting case where the diagonal matrix elements H nn = 
E are all equal, and we choose \ip) in (|4.2fi|) such that H\ip) = E\ip). Now it 
is easy to verify that Yl!n z nm = E — E for any m, so that í)4.31|) gives = 
\E - E \t. Also, dOH) vanishes, leaving the spread as AM ~ (\E - E Q \t) 1 / 2 . 
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4.6 Iterated Bell Trajectòries 

While we have succeeded in re-analyzing the path sum in a way motivated 
by Bell's jump rule ()4.16|) . the new rule ()4.24|) is somewhat cumbersome to 
implement and gives rise to complex path scores. A more natural approach 
is found by going the other way: re-analyzing Bell's jump rule in a way 
motivated by path integral methods. 

Consider the observable O = \n)(n\, so that (0) t = \ip n (t)\ 2 . (j4.21J) now 
gives the path integral representation of a kind of Green function for |^ n (í)| 2 
rather than tp n (t). The classical analog of this in probability theory is just 
the equation 

P n (t)= Pn (0)PTob[V(N)] (4.34) 

V(N-l) 

where the sum is taken over all paths V(N) with = n fixed. This gives 
the expectation value of a classical state function O as 

(0) t = On N PnM Prob[P(iV)] . (4.35) 

V{N) 

What makes this merely an analogy to the quantum case is that in the latter 
the amplitudes involved in the path sum are generally complex, seemingly a 
necessity to achieve the affects of quantum interference. But Bell has shown 
us that we need not be content with just an analogy. The integral formulation 
of Bell's model based on (J4.23j) re-expresses the quantum path sum with a 
real, strictly positive path amplitude, exactly as in (|4.34|) and (|4.35j) . This 
is accomplished simply by choosing the T nm according to (j4.16J) . with the 
result that P n (t) = \ip n (t)\ 2 for all í > if we set P n (0) = |^„(0)| 2 . Quantum 
interference effects, here, are not manifested by the contributions from certain 
paths cancelling those from other paths, but rather by the propensity of 
beable trajectòries to follow or not to follow certain paths in the first place. 
The sum over histories is manifestly undemocratic here. 

To implement this as a stochastic algorithm we need to use a trial wave- 
function for ip in f)4 . 1 7j) . The beable distribution P n (t) of the resulting tra- 
jectòries then serves to compute |-0 n (í)| 2 and (0) t as stochastic averages for 
dUSH) and (ETÏÏïïj) respectively. 

In analogy with Goldstein's algorithm, we might then attempt to use these 
trajectòries to generate a new trial ip^\ which could be used to generate new 
trajectòries, etc. The problem is that, while in the continuous case knowledge 
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is enough to compute new trajectòries via (j4.14j) . here we 
need Rej^} to compute new trajectòries via flUHJ). But Re{z®} depends 
on the phase as well as the amplitude of ip^ . 

What must be done is to derive an evolution equation for Re{z nm } from 
ÍI4.15|) in the discrete case, just as Bohm obtained (j4.14|) from the continuous 
Schrodinger equation. Using (J4.15j) to differentiate the defmition (|4.17j) gives 

= £<*--*"> (4.36) 

k 

which contains the same information as ()4.15j) itself. Notice that the (time- 
independent) Hamiltonian does not enter as a parameter in this equation, but 
only in the initial conditions when the z nm (0) are given in terms of the ip n (0). 
An additional dH/dt term would appear in ()4.36|) for the time-dependent 
case. 

Now consider propagating ()4.36j) along a trajectory V at the i-th stage 
of iteration. One scheme would be to evolve Znl +1 n p along V by evaluating 
the right hand side of (J4.36)) with z^ 1 '. But we might as well take the z nm 
factor on the right hand side as the current iterate z^\ since we can do so 
with the information obtained just from propagating V. Thus we propagate 

d_ 

dl 



ziX, Z^ln ^ ^X Z km Z kn ) (4.37) 



k 



along beable trajectòries generated by the jump rule Tnm = 2Re{zn m 1 ^} 
(Re{z} > 0). An alternative scheme can be obtained by defining 

Z m ee Yl z ™ = E - ÏH ^T% = "I l0 ^*rn(t) (4-38) 
and summing 1)4.36)1 over n, which gives 

^ ^ m ^ ^ ^nm ■ (4.39) 

n 

Using 1)4.38)1 in 1)4.36)1 . we obtain the time-dependence z nm oc exp[f(Z m — 
Z n )dt] and can thus eliminate the z nm from (j4.39j) . After extracting the 
n = m term from the sum, we have 



(0) exp (j [Z m (s) - Z n (s)] dsj 



d z 

dt 



(4.40) 
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as a replacement for (|4.Hfij) . This can be cast for an iteration scheme by 
taking the Z's in the sum as (i — l)-th iterates, hence known functions in 
the above equation, and the other Z's as i-th iterates. It would also be 
possible to keep Z m in the sum as an i-th iterate, but this would result in a 
complicated integro-differential equation. 

Having propagated the i-th ensemble of trajectòries and calculated Z^ % >— 
in the former scheme — along each of them, we still have to determine 
for the sites in state space that neither contain or neighbor any members of 
the beable ensemble at each time step. The natural choice in the present 
context of a discrete state space is simply to retain the Z^ l ~ l > vàlues at these 
sites. Since trajectòries are attracted to sites with large IV' | 2 > this will cause 
the algorithm to concentrate on these sites and not on the low probability 
ones. 

In essence, this scheme is constructed so that the iterate Z^~^ guides 
the i-th set of trajectòries, which then serve as an importance sampling 
prescription to calculate the next iterate Z^\ In this way an initial trial Z(°>, 
or equivalently may be iteratively improved as the algorithm (hopefully) 
converges in computing the dynamical expectation value (0) t for a given 
observable O. 

The effectiveness of this method of importance sampling, here in the 
context of sampling entire dynamical evolutions as opposed to simply ground 
state distributions, is a delicate matter above and beyond qüestions of the 
stability and accuracy of the iteration scheme itself. 

In implementing this algorithm, one must be cautious of potential diver- 
gences of the Z m , since this will generally occur if ip m = 0. Fortunately, we 
will not have to control these divergences in a very sensitive manner when 
solving the dynamical equations (|4.4Uj) themselves because we expect succes- 
sive iterates to iron out any initial imperfections. Our goal is then merely 
to prevent catastrophic events that might corrupt computations occuring 
on neighboring sites. We can accomplish this much with a simple implicit 
method (supressing the site index): 

Z{t + At) = Z{t) + Z'{t)At + 0(Aí 2 ) = ^Mr~ + °( Aí2 ) • 

1 z(t) ^ 

In fact, for better accuracy, we adopt a second order implicit method based 
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on the relation: 

Z(t + At) = P^- r + OÍAt 3 ) . 

1 Z(t) * \^ Z(t) 2 2 Z(t) ) ^ 

The form of the denominator has been chosen to reproduce the appropriate 
Taylor expansion up to second order and requires a computation of Z"(t), 
which can be accomplished by differentiating (|4.4U|) once to get 



— 7. 1" Z n (yZ n 

dt 



To implement this in our iteration scheme, we must remember that terms in 
the sum are to be taken as (i — l)-th iterates and terms outside the sum as 
i-th iterates. 

In order to propagate (|4.40J1 along trajectòries we need to compute both 
Z and f Zdt at each step. The latter must also be done to second order by 
expanding the integrand in X* + * Z(s)ds around s = t: 

/t+At 
[Z(t) + Z'(t)(s -t) + ---]ds = Z(t)At + \Z\t) At 2 + 0(Aí 3 ) . 

There is nothing in principle to prohibit an implicit third or higher order 
method for propagating Z and f Zdt along these lines; each higher order 
would simply require another differentiation of ([4.40)1 . 

The distinguishing characteristic of an iterative algorithm such as the one 
presented here is that while the solution Z at some time t will be directly 
fixed in any one iteration by Z at times just prior to t, successive iterates 
will bring in information from much further back before t. The global nature 
of this process is paid for by the computational cost of propagating a whole 
new time-development for each iterate. 

However, it is possible to taylor this trade-off by subdividing the total 
simulation interval (0, T) into windows (t w -x,t w ) each of some fixed duration 
Atyy so that iteration will proceed in each window until convergence is ob- 
tained before moving on to the next window. Thus, in the first window, the 
initial conditions Z(t = 0), together with some initial guess for Z(0 < t < ti), 
will be used to begin iteration, and after convergence is reached the solution 
endpoint Z{t\) will provide the initial condition for the next sequence of iter- 
ations over t 2 ), etc. We will determine that convergence has been reached 
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by the i-th iterate in a given window (t w -\,t w ), if the correlation coeficient 
of the two data sets {Zm (t w )} and {Zm(t w )} exceeds some threshold value 
very close to 1. 

As with the guided random walk methds, we also need some kind of trial 
wavefunction — here, as the initial iterate ^°'(t) or Z^°\t) for each window 
(t w -i,t w ). One may obtain this ij)( ' from a completely separate method, 
which would likely rely on considerations more specific to the Hamiltonian in 
question. However, in order to concentrate on the iterative algorithm alone, 
we will use a rather simple trial: 



where / indicates the final iteration of the previous window. The benefit 
of this is to make the algorithm totally self-contained. The costs are an 
increase in the number of iterations necessary for convergence and, most 
probably, decreases in the stability of the algorithm as T grows. 

All that remains is to specify a physical system and initial condition, and 
then we can investigate the performance of this iterative trajectory algorithm 
as a function of its various paramters. We will consider a ld ferromagentic 
lattice of spins cr n with Heisenberg Hamiltonian 



where we have taken the prefactor 1/2, together with h — 1, as defining our 
unit of time. Periòdic boundary conditions are imposed so that <Jn+i — 
Since Bell's theory operates in a preferred basis that defines the (discrete) 
classical configuration space over which trajectòries are propagated, let us 
select the tensor product basis of a z eigenstates for each spin. In particular, 
we will confine ourselves to the (dynamically invariant) subspace of a single 
spin excitation, i.e. the eigenspace of a z n with second lowest eigenvalue. 
This implies an iV-element set of classical configurations indexed by n, cor- 
responding to the possible locations of a single spin excitation on the ld 
circular lattice. 

With the above definitions, we see that (n\H\m) vanishes, hence so does 
z nm , unless \n — m\ < l(modiV). The jumping rates determined by (J4.16j) 
then imply that partides can jump only between neighboring sites on the 
lattice (in a single time-step Ai). 




(4.41) 




(4.42) 



71=1 
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Exact energies and eigenstates for (|4.42jl may be obtained in the single 
spin excitation subspace the eigenstates are found to be spin waves on 
the lattice: 

N 

\4>a) = J2 e2ma/N \ n ) 

71=1 

where a is an integer, with —N/2 < a < N/2, characterizing the energy 
E a = 4sm 2 (na/N) and direction of the spin wave. One can form quasi- 
coherent states of left/right-moving spin waves, which we will approximate 

as 

\oc)l/r = X] ex P 

a 

where left-movers (L) take the + sign and a sum over < a < N/2, and 
right-movers (R) take the — sign and a sum over —N/2 < a < 0. 
We can thus define our initial state at t = as a superposition 

|V(0)) = \a) L + \\-a) R . 

We will take a = a/6, which corresponds to coherent state distributions 
peaked around six spin wave quanta. The left-moving state is centered 
around the position n = at t = 0, while the extra 7r phase of the right-mover 
|— q)r shifts its center to n = N/2. The two first make contact around t = 7 
in our units, at which point they begin to interfere with each other. 

We illustrate the action of our algorithm, with two large windows of size 
At w = 6.05, iVtraj = 50 trajectòries, and a time step At = 0.018. Plotting 
the time dependence of the correlation coefficient between calculated and 
exact vàlues of lm{Z m (t)} in Fig. 14.11 at various stages in the iteration, we 
see how our inaccurate, static initial iterate is gradually ironed-out over the 
course of many iterations. (We have used Im{Z} and not Re{Z} here because 
the Im{Z(°)} vàlues are more stark in their deviation from the exact vàlues.) 
The poor quality of the static initial iterate together with the large 
window size used in this illustration necessitate many iterations (~ 1300 
per window) to achieve the level of accuracy shown in the figure. Shorter 
windows, relative to the natural system time-scale, will likely be preferable 
in most calculations. 

In general, the larger we make Atw, the more (temporally) global the 
algorithm can be, and the more accurate the algorithm is in calculating Z 
at some fixed time t. In the absence of an accurate, long-time zeroth iterate 
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Figure 4.1: The correlation coeficient Corzz between calculated and exact vàlues 
of lm{Zm (í)} at iteration i = 1 (dahsed), i = 120 (dotted), and i = 1360 (sòlid) 
for the first window t 6 (0,6.05) — and at i = 1 (dashed), i = 120 (dotted), and 
i = 1230 for the second window t £ (6.05, 12.1). 
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Figure 4.2: The number -/Vi tcr of iterations (convergence required at the 10~ 8 
level) necessary to achieve a given stability time t c , as we vary the window size 
At\y- We have used -/V tra j = 10 trajectòries, and a time step Aí = 0.018 here. 
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Fi gure 4.3: The stability time t c as a function of the time step Aí and number 
-^traj of trajectòries. We have used a window size Atw = 1.0 here. 

we can expect that lack of exact convergence over successive windows 
(and over longer times within each window) will cause an accumulation of 
errors that ultimately destabilize the algorithm by some time t c . We can 
somewhat arbitrarily peg this stability time as that at which the correlation 
coefficient between calculated and exact vàlues of lm{Z m (t)} first dips below 
some threshold, say 0.75. Increasing Atw wm tend to increase t c , but it 
will also require more iterations within each window to achieve a desired 
level of convergence, say to one part in 10 8 , as measured by the correlation 
between Ke{Zm (t w )} and Re{Zm (t w )} . We thus face a trade-off between 
the required number of iterations N íteT for some fixed Atw an d the stability 
time t c (see Fig.Ol). 

Similar trade-offs exist between t c on the one hand and on the other: (i) 
the number iV tra j of trajectòries used in our ensemble, and (ii) the size of 
the time step At. Fig. 14.31 gives an idea of the interplay between these two. 
The general lack of smoothness in these plots is the result of the algorithm's 
sensitivity to small changes in certain of its parameters. For instance, small 
changes in Atw may shift the window edges into or out of resonances in 
the dynamics; difficult moments in the simulation may respond differently 
depending on where they occur relative to these window edges. 

The results presented here are limited by the choice of a static initial 
iterate (|4.41|) . Ultimately this purification scheme will depend on the quality 
of that with which we begin the purification. What we have then is at least 
one stage in a migration of configuration space methods to the problem of 
simulating dynamics. 
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Chapter 5 

Beables for Quantum Control 



Beable methods offer two potential benefits in simulating quantum systems. 
We have already discussed how beables can help to define iterative numerical 
algorithms for simulating dynamics over an associated classical state space. 
But the beale framework also offers the possibility of a more concrete phys- 
ical interpretation of simulation results themselves. Moreover, when we are 
interested not only in properties of some final state but also qüestions con- 
cering exactly how that final state was arrived at, beables can be of use. 
One current research area in particular is prone to such qüestions: quantum 
control of molecular systems via pulse-shaped làsers. 

Advances in amplitude and phase modulation for ultrafast làsers, fast 
detection techniques, and their integration via closed-loop algorithms have 
made it possible to control the dynamics of a variety of quantum systems in 
the laboratory. Excitation may be either in the strong or weak field regime, 
with the goal of obtaining some desired final state. Success in achieving 
that goal is gauged by a detected signal (e.g., the mass spectrum in the case 
of selective molecular fragmentation), and this information is fed back into 
a learning algorithm [SE], which alters the làser pulse shape for the next 
round of experiments. High duty cycles of ~ 0.1 seconds or less per control 
experiment make it possible to iterate this process many times and perform 
eflicient experimental searches over a control parameter space defining the 
làser pulse shape. 

As an example of this process, experiments have employed closed-loop 
methods for selective fragmentation and ionization of orgànic [32] and organometal- 
lic |H3] |JT] compounds, as well as for enhancing optical response in solid-state 
and other chemical systems [12] |HI] jS] Yields of targeted species are 
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typically enhanced considerably over those obtained by non-optimized meth- 
ods. It is found that the optimal pulse shapes achieving these enhancements 
can be quite complicated, and understanding their physical significance has 
proven difficult. The same general observations also apply to the many opti- 
mal control design simulations carried out in recent years [1Ü] |Hj |H] IIH]- 
This chapter will address the identification of control mechanisms in the- 
oretical calculations as well as for direct application in the laboratory by 
offering a definition of mechanism in terms of beable trajectòries. 

5.1 Beables and Quantum Theory 

Consider a control problem posed in terms of the quantum evolution 

ihj t \m) = (Ho - ^E(t))\m) (s.i) 

over a finite dimensional Hilbert space with basis |n) where n = 0,1, 2,.... 
Here H(t) = Hq — fiE(t) incorporates the effect of the control field E(t) via 
the dipole moment operator fi, and we can explicitly follow the evolution of 
into a desired final state ^(íf)). 

We are concerned with the question: what is the importance of a given 
sequence ri\ — > n 2 of actual transitions — or, more specifically, of a 

given trajectory defined as a function n(t) of time — in achieving the desired 
final state \ip(tf))7 In other words, it is clear that the system is being driven 
into a desired state, but can we find a physical picture of how this is being 
accomplished? 

A conventional answer to the question raised above, essentially that given 
by Bohr jSl] on first seeing Feynman's path integral, is to reject the question 
as ill-posed because quantum mechanics is said to forbid consideration of 
precisely defined trajectòries over the classical state space {n}. Neverthe- 
less, it is well established that there exist dynamical models generating an 
ensemble of trajectòries n(t) whose statistical properties exactly match those 
associated with \4>(t)) at each t. In the case of a continuous state space, the 
first such model (as sketched in Chapter HJ) was that of de Broglie [5Ü] , later 
rediscovered and completed by Bohm j^üj- They reintroduce classical-like 
particle trajectòries into quantum theory by taking the probability current 
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J[ip] to describe a statistical ensemble of real partides. So, 



\ip\ 2 \ m t/>(x, í) J 



(5.2) 



gives the velocity of a particle with mass m and position x at time í, in de 
Broglie-Bohm (dBB) theory. The physical particle is taken to exist indepen- 
dently of, but also to have its motion determined by, the wavefunction ip. 
The time evolution of ip itself is just given by the Schrodinger equation. 

Bohm developed a full account of how ensembles of such classical-like 
partides could reproduce the predictions of quantum mechanics. A bàsic 
issue is to compare ^>(x, t) with the statistical distribution P(x, t) describing 
an ensemble of partides evolving by ([5.2)1 . One can show that if the initial 
distribution of partides satisfies P(x, 0) = ^(x, 0)| 2 , then P(x, t) = |^(x, t)\ 2 
will hold for all t > 0. That is, if the ensemble is initially in the "quantum 
equilibrium" distribution given by ^(x, 0)| 2 , the dynamics — (|5.2j) for the 
partides, and the Schrodinger equation for tp — will preserve this equilibrium, 
consistent with the predictions of Standard quantum theory 1201111] • The 
result is easily generalized to arbitrary interacting iV-particle systems by 
taking x as a point in the 3N dimensional configuration space. 

In dBB theory the position representat ion has a special status. While 
one may still regard if) as a basis-independent object, the particle dynamics 
is given by (|5.2|) specifically in terms of (x|^>) rather than (p\ip) or some 
other representation. But, it is easy to formulate analogs of dBB theory in 
different bases. For instance, one might choose the momentum vàlues p as 
the beables of the theory, and the dBB trajectòries x(í) would be replaced 
by momentum space trajectòries p(t). 

In the context of a finite dimensional Hilbert space with basis \n), the 
beables can be taken as the sites n of the classical state space {n} analogous 
to {x} or {p}. Some law analogous to ()5.2|) must be given to generate 
beable trajectòries n(t) over the state space. Such trajectòries would provide 
a physical picture of the quantum transitions induced by a control field E(t). 
John Bell's defmition (|4.16J1 of stochastic trajectòries over {n} is one such 
law. Moreover, it can be shown 55] to be minimal in the sense that any 
alternative law will require higher jump rates — in fact, these higher rates are 
such that the increased flux associated with jumping from n to m is found 
to exactly counterbalance that in the opposite direction. 

The answer to the initial question regarding the importance of a given 
trajectory in achieving the desired state \ip(tf)) is quite simple in Bell's theory. 
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The importance may be taken as just the path probability (j4.2r>j) . 

The argument given in Chapter §Hfor the equivalence of Bell's theory and 
ordinary quantum mechanics ensures that the path probabilities Prob('P) 
are consistent with the quantum distribution |-0 n (í)| 2 governing observables. 
But, it should be noted that Bell's theory is not unique in this regard. The 
rule (]4.16j) may be altered in non-trivial ways while preserving the master 
equation (J4.19|) [55] . although Bell's rule is minimal in the sense mentioned 
above. The definition ([4. 16]) might even be changed in ways that do not pre- 
serve ([4.19]) . if one is willing to relinquish a strict probability interpretation 
for the trajectòries [5*2] . 

In general, there are many different ways to assign probabilities to tra- 
jectòries that all result in the same time- dependent occupation probabili- 
ties P n (t). The predictions of quantum mechanics, therefore, cannot select 
a single assignment. This non-uniqueness at the root of quantum mecha- 
nism identification can be dealt with only by reference to the simplicity and 
explanatory power of a given mechanism definition. Below we adopt the 
definition (]4~Tïï|) . 



5.2 Simulat ing Beables in Quantum Control 



Our ultimate goal is to obtain dynamical mechanism information directly 
from experimental data associated with the closed-loop control field optimiza- 
tion, without pre-existing knowledge of the system Hamiltonian or wavefunc- 
tion. Methods employing Bell's theory for this purpose are presented in £15.41 
but first we shall study control mechanisms for a model system whose Hilbert 
space and quantum evolution are given explicitly in numerical simulations. 

Consider a quantum-optical system with level energies ftw n and dipole 
moments \i nm . Applying an external làser field E(t), the Hamiltonian in the 
interaction picture is 



where uj nm = u> n — uj m . We will drop the subscript / from now on. The 
definition here of \n) as interaction picture states has the affect of eliminating 
larger contributions to the jump probabilities T nm from the terms, hence 
reducing the overall frequency of jumps. E(t) is assumed to be given by 




(5.3) 



n m 
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an independent optimization algorithm designed to, for example, maximally 
transfer population from \n{) to |nf). 

A simple second-order Schrodinger propagator was used to solve (|5.1|) in 
the interaction picture, relying on a factorization of the evolution operator 

as 

AT-l , >. 

r { e -U**«*} = Y[rle^K P+lH{s)ds \ (5.4) 

where t p = pe = ptf/N and T is the time-ordering symbol. Choosing a time 
step e <^ H/ fiE, we can approximate (|5.4|) by dropping the T operations on 
the right hand side and computing the integrals directly. In doing this an 
error is accrued per time step given by the Baker-Hausdorf identity e A+B = 

e A e B e -l[A,B]+- ag 



jf [H(r),H(s)}drds ~ (^-j e 3 uj . (5.5) 

The right hand estimate is obtained by expanding H(r) to first order about 
r = s and noticing that the E'(s) term in H'(s) commutes with H(s). The 
error (|5.5|) would generally dominate third order terms like (fiEe/h) 3 . 
If the control field is given as E{t) = Re{^ a>iEi(t)}, where 

Ei{t) = AtyjWà*** 

with A(t) and <f>(t) possibly adiabatic, we can evaluate J H(s)ds by writing 

t v+ 1 i, A(+ \p' l <t>( t p) 

nEAs)è" s ds « ^ } p) . (ei(^)Wi _ e i(^+^) . (5.6) 
i(w + ^ c ) 

(Simply writing J H(s)ds ~ eH{t p ) is not appropriate because we do not 
want to exclude weak field excitation, i.e. fiE <C Hu, so that ue ~ 1 may 
hold.) Thus in the adiabatic case \ip{t)) can be propagated in steps deter- 
mined by A(t) and <j)(t) rather than the phase factors e luJt . 

Consider the evolution of beable trajectòries according to (j4.16j) . which 
appears to require a time step small enough that each part of H, including the 
giuí Vernís, not vary much over the step. Nevertheless, the total probability 
of jumping from m to n over (t p ,t p+ i) is given by the integral J T nm (s)ds 
over that range with ~ (fiEe/H) 2 corrections. Thus we can take an effective 
jump probability for the interval (t p ,t p+ i) as given by (|4.16|) with 

Znm(tp) ~ — 77 . ~r~ í H nm {s)ds (5-7) 

tTTI \ p ) J tp 



CHAPTER 5. BEABLES FOR QUANTUM CONTROL 



86 



2.13 . -.2.25 
4 5 



1.56 . '1.69 

3 

1.45. ,1.27 

1 2 \ 

0.854- 0.97 




Figure 5.1: The model 7-level system |n) with n = 0,1,..., 6. The transition 
freqüències io nm in units of fs -1 are shown on the right, and non-zero dipole matrix 
elements /i nm in units of 10" 30 C •m are indicated by dotted lines. 

evaluated using ()5.6|) . Note that we have included a factor of 1 /h explicitly 
into the definition í)4.17j) of z nm . If ue <C 1 does not hold, care must be 
taken to extend the integration in (|5.7J) only over t G (t p ,t p+ i) for which 
Re{2; nm (t)} > 0, leading to additional boundary terms in the phase difference 
part of ()5.fij) . Moving the ip* ratio outside the integral in ()5.7j) produces an 
error per time step of order 

e 2 Hdip Í^EeV 
~JT~dt ~ \~^J 

which is again comparable to (|5.5j) . Therefore beable trajectòries may be 
propagated in steps determined by the possibly adiabatic amplitude A(t) 
and phase <f>(t), i.e. synchronously with the Schrodinger propagator. 

5.3 Mechanism Analysis for a Model 7-Level 
System 

The beable trajectory methodology for identification of control mechanisms 
will be illustrated with a 7-level system where u n and \x nm are given in Fig. 
15.11 The (non-adiabatic) control field E(t) shown in Fig. 15.21 is obtained 
from a steepest descents algorithm over the space of field histories [S3 . It 
is optimized to transfer population from the ground state |0) to the highest 
excited state |6). By t = 100 fs, the transfer is found to be completed with 
approximately 97% emciency (see Fig. 15. 3|) . 

Together with the second-order Schrodinger propagator, using time step 
e = .025 fs, an ensemble of iV tra j = 10 5 beable trajectòries is evolved, all 
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Figure 5.2: Electric field E(t) in V/A obtained from an optimization algorithm 
for population transfer from |0) to |6) [53] . 

starting in the ground state n = at t = 0. At each time step, a given 
beable at site m is randomly made either to jump to a neighboring site 
n 7^ m according to the probabilities T nm e given by (j4.16|) with ()5.7|) . or 
else stay at m. Four sample trajectòries are shown in Fig. 15.41 As a check, 
one can count the number of beables residing on each site n at time t to 
estimate the occupation probabilities P n (t) and verify that they match the 
quantum prescriptions |^ n (í)| 2 . The finite-ensemble deviations are observed 
to be consistent with a (-Ar tra j) -1 / 2 convergence law. 

About 60% of the trajectòries generated are found to involve four jumps, 
and of these the trajectòries passing through sites n = 2, 5 are noticeably 
more probable than those passing through n = 1,4. 6-jump trajectòries 
comprise about 30% of the ensemble. And it becomes increasingly less likely 
to find trajectòries with more and more jumps. The largest number of jumps 
observed in a single trajectory was 14. Three such trajectòries occurred out 
of the ensemble total 10 5 . 

A natural expectation is that the optimal field E(t) would concentrate 
on the higher probability trajectòries and not waste much effort on guiding 
highly improbable trajectòries, such as the 14-jumpers, to the target state 
n = 6, as the latter have essentially no impact on the control objective (final 
population of the target state). Interestingly, though, the vast majority of 
even the lowest probability trajectòries are still guided to n = 6. Apparently, 
the optimal field is able to coordinate its effect on low probability trajectòries 
with that on other trajectòries at no real detriment to the latter. We shall 
come back to this point later. 
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Figure 5.3: Population |^6(í)| 2 as a function of time (fs). Detail for small t is 
shown in the inset (same units). 

One way to conveniently categorize the large set of trajectòries, each 
expressible as a sequence of time-labeled jumps (íi,ni) — > (£2,^2) 
is to drop the time labels, leaving only the "pathway" n\ — > n 2 — > 
The importance of a given pathway is then computed as the frequency of 
trajectòries associated with that pathway. Table 15.11 lists some important 
and/or interesting pathways and their probabilities. 

Fig. 15.51 shows some typical trajectòries associated with the first and fifth 
pathways listed in Table 15.11 — involving 4 and 6 jumps respectively. E(t) 
guides the 4-jumpers upward in energy, and they begin to arrive at n = 6 
around t = 80 fs, early enough that stragglers can catch up but too late for 
the over-achievers of the group to head off elsewhere. This corresponds to 
the onset of heavy growth for |-?/> 6 (í)| 2 around t = 80 fs (see Fig. 15. 3j) . The 
6-jumpers first reach n = 6 around t = 50 fs, but almost all fall back to 
n = 5 by t = 80 fs, reuniting with the 4-jumpers just as they begin to jump 
up to n = 6. These 6-jumpers, along with other high-order contributions, 
thus explain the small surge in |^6(t)| 2 between 50 and 80 fs. Another much 
smaller surge around t = 30 fs and one still smaller around t = 20 fs (see inset 
of Fig. I5.3f) are attributable to 8-th and higher order trajectòries "ringing" 
back and forth on 5 <-> 6. 

For t G (70 fs, 80 fs), many of the 6-jumpers are at n = 6 and need to 
be de-excited on the 6-^5 transition before they can jump back up to 
n = 6. Simultaneously, many of the 4-jumpers are at n = 5 and should not 
be prematurely excited on 5 — > 6, lest they not remain at n = 6 through 
t = 100 fs. The optimal field thus faces a conundrum: how to stimulate 
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probability 


pathway 


0.19 


2 3 5 6 


0.16 


2 3 4 6 


0.14 


13 5 6 


0.12 


13 4 6 


0.018 


2 3 5 6 5 6 


0.005 


2 


0.0007 


023564356 



Table 5.1: The five most probable pathways, followed by the highest probability 
pathway failing to reach n = 6 at t = 100 fs, and then the highest probability 
pathway involving a topologically non-trivial cycle in state space. The fractional 
error in the pathway probability P is given roughly by (10 5 P) -1 / 2 . 
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Figure 5.4: One each of the 4, 6, 8, and 10-jump trajectòries generated by the 
jump rule (|4.16f) are shown with their sites n plotted against time. For viewing 
purposes, we have displaced them a small amount vertically from each other and 
tilted the jump lines slightly away from the vertical. 
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Figure 5.5: A sample of 20 trajectòries each from the pathways 2 3 5 6 and 
2 3 5 6 5 6. 



the 2 «-> 6 transition preferentially for the 6-jumpers (in n = 6) over the 
4-jumpers (in n = 5). The means by which this feat is accomplished may be 
understood by reference to the jump rule (|4.16|) . E(t) induces jumps through 
the explicit H nm (t) factor but also through the ip* quotient, which depends 
on E(t) through ()5.1|) . In particular, ()4.18|) implies that at any one time 
t jumps on this transition must be either all upward or all downward. The 
active direction is switched back and forth according to the sign of Re{z 65 (í)}. 

Fig. l5.6l plots \E(t) \ and Rej^s^)}, which controls the upward jump rate 
T 65 (t). For t e (70 fs, 80 fs) one sees that when \E(t)\ is large, most often 
Re{ z 65(t)} dips below zero, disallowing any upward jumps. The correlation 
coefficient between \E(t)\ and Re{z 65 (í)} in this range is —0.4955. On the 
other hand, the correlation between l-E^t)! and Refent)}, which controls 
downward jumping, is +0.4475 over the same range. 

Looking at the trajectòries in more detail, one notices a distinct bunching 
of jumps. Beables tend to jump together in narrow time bands, or else to 
abstain in unison from jumping. This behavior can be gauged by calculating 
the two-time jump-jump correlation function: 

1 N ~ 1 

p=0 

where Jn(t) is the number of jumps of type Q occurring in (t,t + e), and 
íl is a subset of the entire ensemble of trajectòries. For instance, the two- 
time function with íl taken as the set of jumps on the 5^6 transition 
is plotted in Fig. 15 .71 The fs time-scale oscillations correspond to the level 
splittings u nm and the dominant frequency components of E(t). Enhanced 
correlations around r = correspond to the jump bunching noticeable in 



CHAPTER 5. BEABLES FOR QUANTUM CONTROL 



91 



0.06 
0.04 
. 02 



-0 . 02 
-0 .04 

70 72 74 76 78 80 

t (fs) 

Figure 5.6: The optimal field modulus \E(t)\ (dotted line) and Re{z65(í)} (full 
line) in fs -1 over the range (70 fs, 80 fs). Their anticorrelation causes beables to 
be preferentially selected for the downward transition 6^5 over the upward 
transition 5^6. 

the trajectòries. Two side-bands around r = ±40 fs are associated with 
6-jump and higher order trajectòries triat go up, down, and up again on 
5^6 over the approximate time window (50 fs, 90 fs). This conclusion can 
be verified by computing two-time functions with íl specialized to particular 
pathways. Other much smaller features for |r| > 60 fs (see inset of Fig. 15. 7|) 
are attributable to higher order trajectòries ringing on 5 <-> 6. 

In general, the fs oscillations characteristic of these two-time functions 
show that E(t) works in an essentially discrete way, turning on the flow of 
beables over a given transition and then turning it off with a duty cycle of ~ 
2 fs. The associated bandwidth of ~ 0.5 fs -1 is small enough to discriminate 
between all non-degenerate ui nm except between uj 35 (= u^) and u 56 (= u^q), 
which differ by only 0.12 fs -1 . This circumstance leaves effectively three 
distinguishable transitions. With a total time of 100 fs, the control field E(t) 
can potentially enact roughly 150 separate flow operations. The fact that 
trajectòries with pathway probability <C 1% are still almost always guided 
successfully to n = 6 suggests that these ~ 150 operations are more than 
necessary to obtain the 97% success rate achieved by the optimal control 
algorithm in this simulation. It appears that the algorithm actively sweeps 
these aberrant trajectòries back into the mainstream so as to maximize even 
their minute contribution to the control objective. 
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Figure 5.7: Jump correlation function '(t) associated with jumps on 5 — > 6, 
plotted against the delay time r (fs) for the ensemble of 10 5 trajectòries. Detail 
for large r is shown in the inset (same units). 

5.4 Control Mechanism Identification in The 
Laboratory 

Using these beable trajectory methods to extract mechanism information di- 
rectly from closed-loop data is complicated by the fact that we cannot assume 
knowledge of a time-dependent wavefunction, Hamiltonian, or possibly even 
the energy level structure of the system. Frequently in the laboratory, the 
only available information consists of final state population measurements 
and knowledge of the control field E(t). 

The following analysis aims to show how a limited statistical characteriza- 
tion of beable trajectòries may be generated from laboratory data associated 
with a given optimal control field. In particular, we will show how to extract 
jmin, the minimum number of jumps necessary to reach the final state rif 
from the initial state m; also (jv), the average number of such jumps over 
an ensemble of beable trajectòries; and possibly higher moments {(jp) k ) as 
well. After a general formulation of this analysis is presented, it will be ap- 
plied to simulated experimental data in the case of the model 7-level system 
considered above. 

We propose to obtain mechanism information by examining the effect on 
the final state population \ip n{ (íf)| 2 of variations in the control field away 
from optimality. Detecting only this one population, i.e. the control objec- 
tive itself, limits how much mechanism information we can gain. It may be 
possible to obtain a more detailed understanding of a given mechanism by 
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probing additional aspects of the system at times other than just íf, and this 
may be done either in our present framework or through extensive spectro- 
scopic methods. One expects there will always be a trade-off between the 
complexity of such methods and the level of mechanistic detail they reveal. 

Consider a particularly simple scheme wherein the amplitude of the con- 
trol field is modulated by a constant Ai independent of time: 

E(t) É(t) = ME(t) 

giving rise to a new time- dependent solution \ip(t)) — in particular, a new 
final state population ^^(íf)! 2 and new path probabilities Prob("P). These 
quantities are obtained by taking T nm — > T nm in ()4.23j) . which is to say using 
E{t) and ip n (t) in the jump rule (j4.1fi|) . 

To express Prob(P) in terms of Prob(P), we can write 



Yl T n P+ m p - M 3V Yl T np+inp Y[ c( ^ ( _ 

pgJ peJ peJ p 



cos©„ 
— x 



l-T |^n p+1 (*p)| / -pr |^n p+1 fa)| \ ' 

where j-p is the number of jumps in V and 

b p = arg ( -iH np+inp p - 

Yn p V>p, 

To simplify ()5.8|) . note that if jp were very large, then successive terms 
in each of the last two products would tend to cancel, leaving only endpoint 
contributions. Making the reasonable approximation that they do completely 
cancel yields 

ITT ~ M* ^ nf(tf)l TTT TT^^ (59) 

Further, we can make the expansion 

_ ln TJ^ÉE = a ^\M-l) + af{M-lf + --- 
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about Ai — 1, where the depend on the path V but not on Ai. And 
similarly: 



= ^jE' n T nnp + b ( i\M - 1) + 



Combining these expansions gives a relationship between the path probabili- 
ties Prob("P) in the modulated case and those, Prob("P), in the unmodulated 
case, which are the ones containing mechanism information regarding the 
actual optimal control field E(t). We can thus write the final population as 



l<M* f )| 2 = ^|^ (0)| 2 Prob(P) 



v 



« E \Í>nM\ 2 M^e-^ M -^ Prob(P) 

where ap = + fop , and higher order terms in the expansion have been 
dropped. (This approximation is not as crude as it might seem, since for 
small Ai away from 1, the behavior of |"0n f (íf)| 2 is dominated by the A\? T 
factor.) Cancelling one power of |"0 nf (íf)|, and recalling that the sum is taken 
only over paths ending on n = n t so that |^ nf (if)| 2 = ^pProb(P), we have 

fe(*f)| « \^n { (tf)\(M^e- a ^ M ^) (5.10) 

where (• • •) denotes an average over the trajectory ensemble generated by 
the (unmodulated) optimal field E(t). Beables in this ensemble are taken as 
initially distributed at t = according to |^ n (0) | 2 , and only trajectòries that 
successfully reach n = n { at t = t { are counted. 

Note that for Ai close enough to 0, the minimum value j m { n taken on by 
j<p will dominate the expectation value in (|5.10|) . and 

ln|^ f (í f )| = Jmin \nAi + 0(1) (5.11) 

gives the dominant behavior independent of a-p. If we suppose that ap, where 
it is relevant, depends primarily on the endpoints of V, which are fixed, and 
only weakly on the rest of the path, then ap can be approximated by some 
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Figure 5.8: The best fit of (|5.12|) to the simulated |V> nf (íf)| 2 data (10% noise) as 
a function of Ai; it occurs over the fitting range Ai E (.44, .92). 

characteristic value a. Putting Ai^ v = eP v in (|5.1()jl and expanding in 
powers of ln Ai now gives 

|</V(í f )| » l^^le-^-^f^i^an^í)^ (5.12) 

k=0 

for the final state population under a modulated field, expressed in terms of 
the desired statistical properties of the trajectory ensemble under the optimal 
field itself. Here, a enters as an additional parameter that must be extracted 
from the data. Equations ()5.11|) and (J5.12J1 form the working relations to 
extract mechanism information from laboratory data. 




5.5 Simulated Experiments on a 7-Level Sys- 
tem 

In order to extract quantities like (j-p) using the results ()5.11j) and (|5.12j) data 
must be generated for the final state population \ip n( (t{)\ 2 at many vàlues of 
the modulation factor AA over some range (Ad m in, M. 

max ) rs ~ J 

(0,1.5). The 

desired quantities are obtained as parameters in fitting ()5.11|) and (J5.12|) to 
the data as a function of Ai. 

One set of simulated data for the above 7-level system is shown in Fig. 
15.81 the sampling increment is AM = .01. Noise has been introduced by mul- 
tiplying the exact \ip ni {t{)\ 2 vàlues by an independent Gaussian-distributed 
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Figure 5.9: The derivative is calculated from simulated data with noise level 
a = .1; its limiting value as ln.M — > — oo gives j m in- 

random number for each value of Ai, where the distribution is chosen to have 
mean 1, and various Standard deviations a have been sampled. 

We can determine j m i n from the data using í)5.11j) . which implies 

j m ,„ = Mm . (5.13) 

For instance, Fig. 15.91 plots the derivative in (j5.13|) . calculated with finite 
differences from the \ip nf (íf)| 2 simulated data for o = .1, which correctly 
gives j min = 4 as the limiting value. Determination of j min proved robust to 
multiplicative Gaussian noise up to the 40% level (a = .4). 

The quantity (jp) is more dimcult to extract, because while the sum in 
(I5.12J) converges to as Ai — > 0, the terms of the sum individually diverge 
and must cancel in a delicate manner. Therefore truncating the sum to an 
upper limit /c max becomes a very bad approximation near Ai = 0. This unsta- 
ble behavior can be controlled by carefully setting the range (Ai m m, A^max) 
of data to be fltted, given a choice of fc mffi . 

It is also convenient to constrain the fit by the previous determination of 
Jmin = 4. We have done this by noting that if E(t) is truly optimal, then 
|-0 nf (íf)| must have a maximum at Ai — 1, which implies that a = (j-p)- 
This can be used as a weaker constraint on the auxiliary parameter a by just 
requiring a > j min = 4 in the fit without necessarily supposing that E(t) is 
exactly optimal. We then check that a m (j-p) is satisfied in the fit. Fig. 15.81 
shows one such fit where the fitting range is Ai G (.44, .92). One can see 
that the fit closely tracks the data for Ai in this range but quickly diverges 
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from the data just below Ai = .44 (and, less severely, above Ai = .92) due 
to the sum-truncation instability mentioned previously. 

In order to identify appropriate ranges in general, we have searched over 
all combinations such that 



Mathematica's implementation of the Levenberg-Marquardt non-linear fit- 
ting algorithm was used on simulated data for each value of a between and 
.5 with a .01 increment. The best fit at each a was used to determine the 
value of (j-p) most consistent with the simulated data at the given noise level. 

For this analysis & max = 4 was chosen somewhat arbitrarily to balance 
computational cost and precision. In practice it is likely that the moments 
((jp) k ) for lower k vàlues will be most reliably extracted from the data, 
especially considering the laboratory noise. In the simulations it was found 
that (j-p) could be reliably extracted, but higher moments were unstable 
and unreliable. For example, ({jv) 2 ) was frequently found to lie slightly 
below the corresponding fit vàlues for (j-p) 2 , which is inconsistent with the 
interpretation of these vàlues as statistical moments of an underlying random 
variable jp. Further constraints could be introduced to attempt to stabilize 
the extraction of higher moments, but care is needed so as not to overfit the 
data. 

Fig. 15.101 shows the (jp) vàlues obtained by fitting data with o = .1 for 
each choice of (A4 m i n , Ai max ) and Fig. 15.111 shows the corresponding quality 
of each fit as measured by its mean squared deviations. In Fig. 15.101 as well 
as in the corresponding plots for all other vàlues of a studied, two diagonal 
strips emerge running above a set of smaller islands. The surrounding white 
"sea" comprises fits that give (jp) < 4, which we know to be ruled out by 
the determination of j mm . 

A virtually identical pattern arises in the fit quality plots. The two strips 
and underlying islands are seen to give much better fits than the white sea. 
An additional connected region of good fits is found to extend across the 
lower-left còrner of Fig. 15.111 nearly all of which are ruled out by j m i n = 4. 
This connected region is somewhat pathological because much of it cor- 
responds to fitting ranges that fail to capture the important behavior of 
|"0n f (íf)| 2 near Ai — 1, and therefore can be ignored. Then the best fits 
for all vàlues of a sampled are found to come from the cluster of islands 




min 



< .8 

< 1.6 
, > 10 



(5.14) 
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Figure 5.10: Fit vàlues for (j-p) over a set of different fitting ranges (Aí m in, Animin); 
a = .1 here. 
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Figure 5.11: Fit qualities as measured by the inverse of the mean squared devia- 
tions between the simulated data and the fit; a = .1 here. The highest fit quality 
appears at (.44, .92). 
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Figure 5.12: Best fit vàlues for (j-p) as a function of the noise level a, compared 
to the exact value (j-p) = 4.907 (gray line). 

at -M max ~ .95. As o is increased from to .5, these islands flow from 
-^ímin ~ .5 to .Mmin ~ .3, carrying with them the best fit site. Note that the 
small triangular area in the lower right còrner, most noticeable in Fig. 15.111 
is a region excluded from consideration by the third constraint in (|5.14|) . 

The best fit vàlues of (j-p) are shown as a function of a in Fig. 15.121 These 
vàlues are to be compared with the exact result (j-p) = 4.907 obtained from 
the trajectory ensemble calculations in £ jSHÜ which require explicit knowledge 
of the level structure and dipole moments fi nm of the system. The ramping 
behavior in Fig. 15.121 results from the sampling increment AA4 of the simu- 
lated data. Transitioning between one ramp and another corresponds to the 
shifting of the best fit location by one or two units of AAi. 

These (jp) vàlues are in good agreement (3% discrepancy) with the exact 
value for noise at the level of 0-25%. It should be noted that a qualita- 
tive change occurs in the case of no noise (er = 0), where the islands all 
disappear and the strips become extended much further on the downward 
diagonal. Inspecting the fits individually indicates that mean squared devia- 
tion does not give an adequate measure of fit quality in this special case. This 
anomaly seems due to the fact that, in the absence of Gaussian noise from 
experimental statistics, systematic deviations from ()5.12|) associated with the 
approximation ()5.9|) become important. 

Since Bell's model can be defined for any choice of basis |n), there is a 
more general question of how the above kind of mechanism analysis might 
vary with the choice of basis. Beyond that, Bell's jump rule (|4.1fi|) itself per- 
mits gener alization providing additional freedom over which trajectory 



CHAPTER 5. BEABLES FOR QUANTUM CONTROL 



100 



probability assignments may vary. The import of this freedom for mechanism 
identification remains to be determined. 
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